Tencent / HunyuanDiT

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
https://dit.hunyuan.tencent.com/

Unclear LoRA training documentation #125

Closed storyicon closed 10 hours ago

storyicon commented 3 days ago

I think the documentation for LoRA training is currently unclear. For example, dataset/porcelain/jsons/porcelain.json is mentioned in many places, but this file does not exist in the repository. Users who want to understand the structure of this file have no option but to read the code. If the documentation and tooling are poor, I think the growth of the ecosystem will be held back.

yuanhaoliang commented 1 day ago

You should use this command:

 idk base -c dataset/yamls/porcelain.yaml -t dataset/porcelain/jsons/porcelain.json

It converts porcelain.yaml into porcelain.json.
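For anyone scripting this step, here is a minimal Python sketch that generates the JSON index only when it is missing. It uses nothing beyond the `idk base -c ... -t ...` invocation quoted above. The helper names `build_idk_command` and `ensure_index` are hypothetical, and creating the output directory beforehand is an assumption, not documented IndexKits behavior.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_idk_command(yaml_path: str, json_path: str) -> list[str]:
    """Return the `idk base` invocation from this thread as an argv list."""
    return ["idk", "base", "-c", yaml_path, "-t", json_path]


def ensure_index(yaml_path: str, json_path: str) -> bool:
    """Create the JSON index if it is missing.

    Returns True if the index file exists afterwards, False otherwise.
    """
    target = Path(json_path)
    if target.exists():
        # Index already generated; nothing to do.
        return True
    if shutil.which("idk") is None:
        # The `idk` CLI (IndexKits) is not on PATH, so we cannot generate it.
        return False
    # Assumption: make sure the output directory exists before calling idk.
    target.parent.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if idk exits with a non-zero status.
    subprocess.run(build_idk_command(yaml_path, json_path), check=True)
    return target.exists()
```

Calling `ensure_index("dataset/yamls/porcelain.yaml", "dataset/porcelain/jsons/porcelain.json")` before launching training would run the conversion once and skip it on later runs.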

zml-ai commented 10 hours ago

Hi, we suggest adopting IndexKits for data management, Arrow for data formatting, YAML for data preprocessing, and JSON for training script input.

Please refer to the README for comprehensive data preparation guidelines.