Sorry for the confusion. The structure in the README is correct: in the first stage (VQ-VAE pre-training), only the 3000 training characters of the content font are used for training, and the remaining 500 characters are used to test generalization. In the second stage (training the font-generation model), there is no need to split characters into training and test sets within the training and test fonts; the split is determined by train_unis and val_unis when generating train.json.
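For reference, here is a minimal sketch (not code from the repo) of how such a 3000/500 character split could be written out as train_unis.json and val_unis.json. The input file name and the JSON format (hex codepoint strings such as "4E00") are assumptions; check an existing meta file for the exact format the scripts expect.

```python
# Hypothetical sketch: split 3500 content-font characters into
# 3000 seen / 500 unseen unicodes and dump them as JSON lists.
import json
import random

# Assumed input: a plain-text file containing the 3500 characters.
with open("all_characters.txt", encoding="utf-8") as f:
    chars = [c for c in f.read() if not c.isspace()]

random.seed(0)
random.shuffle(chars)

def to_uni(c):
    # Represent each character as an upper-case hex codepoint string (assumption).
    return hex(ord(c))[2:].upper()

train_unis = sorted(to_uni(c) for c in chars[:3000])
val_unis = sorted(to_uni(c) for c in chars[3000:])

with open("train_unis.json", "w", encoding="utf-8") as f:
    json.dump(train_unis, f)
with open("val_unis.json", "w", encoding="utf-8") as f:
    json.dump(val_unis, f)
```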
So while preparing the lmdb, what should be passed to --content_font: the images generated from train_unis.json, or from train_val_all_characters.json?
```bash
python3 build_meta4train.py \
    --saving_dir ../results/chinese_dataset/ \
    --content_font ../datasets/images/content_font/ \
    --train_font_dir ../datasets/images/train \
    --val_font_dir ../datasets/images/val \
    --seen_unis_file ../meta/train_unis.json \
    --unseen_unis_file ../meta/val_unis.json
```
For the second stage, the content-font directory used to build the lmdb should contain all 3500 (train + val) characters, i.e. ../datasets/images/content_font/train_val/.
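A quick sanity check along these lines can confirm the content-font folder really covers the union of both unicode lists; this is only a sketch under the assumption that the images are named `<unicode>.png`, so adjust the naming rule to your data.

```python
# Hypothetical check: does content_font/train_val/ contain all 3500 characters
# listed in train_unis.json and val_unis.json?
import json
import os

with open("../meta/train_unis.json", encoding="utf-8") as f:
    seen = set(json.load(f))
with open("../meta/val_unis.json", encoding="utf-8") as f:
    unseen = set(json.load(f))

needed = seen | unseen
content_dir = "../datasets/images/content_font/train_val/"
# Assumed naming convention: one image per character, file stem = unicode string.
have = {os.path.splitext(name)[0] for name in os.listdir(content_dir)}

missing = needed - have
print(f"need {len(needed)} characters, {len(missing)} missing from {content_dir}")
```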
I got it. Thank you so much for your time and prompt replies.
Dear author, could you please provide your dataset? Thank you very much.
@Djs-Champion Hello, due to copyright reasons I cannot provide the dataset directly. Please read the Data Preparation section of the Readme carefully. For building the lmdb, you can refer to issue #6.
Hello, should the path in the ipynb file point to the content folder, or separately to the train and val folders?
This is my directory structure:
I am confused about the dataset preparation. In Data Preparation you describe the structure shown in the image, but in one of the issues you mention the following structure. Can you help me with this?