How to train on own dataset?

wlz987 commented 1 month ago

How do I train on a custom dataset?

LSIbabnikz commented 1 month ago

To train on a custom dataset, you first need to generate two things: quality scores, using DifFIQA or possibly any other FIQA technique, and embeddings, using the provided CosFace model.

After that, you can follow the instructions for training the model in the README, replacing the provided files with the ones you generated yourself. Make sure that the generated files have the same format as the provided ones. Additionally, you may need to make some minor tweaks in prepare_data.py, depending on the structure of your dataset (probably in the following code snippet, regarding the identity variable):

https://github.com/LSIbabnikz/eDifFIQA/blob/d2c4b610bb1dc38236b3d49928bdbd810b8a3709/prepare_data.py#L37-L41
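
As an illustration of the kind of tweak this might involve, here is a minimal sketch of deriving an identity label from an image path. It is not the code from prepare_data.py; the helper names `identity_from_path` and `identity_from_filename`, and the two dataset layouts they assume (one directory per identity, or an identity prefix in the file name), are hypothetical and only meant to show the idea.

```python
from pathlib import Path


def identity_from_path(image_path, level=1):
    """Hypothetical helper: use a parent directory name as the identity.

    level=1 assumes a layout like root/<identity>/<image>;
    level=2 assumes root/<identity>/<session>/<image>, and so on.
    """
    return Path(image_path).parts[-(level + 1)]


def identity_from_filename(image_path, sep="_"):
    """Hypothetical helper: use the file-name prefix as the identity.

    Assumes file names like <identity>_<index>.jpg.
    """
    return Path(image_path).stem.split(sep)[0]
```

Whichever rule fits your dataset, the point is that every image of the same person must map to the same identity value.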

phonPtan commented 1 week ago

Hello @LSIbabnikz,

I generated quality scores by running [python inference.py -c ./configs/inference_config.yaml](https://github.com/LSIbabnikz/DifFIQA/tree/main/diffiqa), but no embeddings were generated. How can I save the embeddings?

Thank you

LSIbabnikz commented 1 week ago

Hi,

the code you included seems to reference DifFIQA rather than eDifFIQA. To generate the embeddings of the input samples in addition to the quality scores, first alter the config file of the model you are using. For example, if you are using eDifFIQA(L), edit the _ediffiqaL_config.yaml_ file and change the value of __return_feat__ from 0 to 1.

https://github.com/LSIbabnikz/eDifFIQA/blob/d2c4b610bb1dc38236b3d49928bdbd810b8a3709/configs/ediffiqaL_config.yaml#L25-L28

After that, you will also need to make some changes to the inference.py script. Since the __return_feat__ flag is now set in the config file, the model should return a tuple of values (quality scores and embeddings) instead of the quality scores alone. You will need to accumulate the embeddings over all batches and save them, similar to how the quality scores are currently handled.

https://github.com/LSIbabnikz/eDifFIQA/blob/d2c4b610bb1dc38236b3d49928bdbd810b8a3709/inference.py#L33-L45
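
The accumulation step could look roughly like the sketch below. It is not the actual inference.py: `fake_model` is a hypothetical stand-in for a model that returns a (quality scores, embeddings) tuple, `run_inference` and `save_outputs` are illustrative names, and storing the results as pickled dictionaries keyed by image name is an assumption. Check the provided files for the format the training code really expects.

```python
import pickle

import numpy as np


def fake_model(batch):
    """Hypothetical stand-in for a model with return_feat enabled.

    Returns one quality score and one embedding per image in the batch.
    """
    flat = batch.reshape(batch.shape[0], -1)
    return flat.mean(axis=1), flat[:, :4]


def run_inference(model, batches, names):
    """Collect scores and embeddings batch by batch, then join them."""
    all_scores, all_feats = [], []
    for batch in batches:
        scores, feats = model(batch)  # tuple, because embeddings are returned too
        all_scores.append(np.asarray(scores))
        all_feats.append(np.asarray(feats))
    scores = np.concatenate(all_scores)
    feats = np.concatenate(all_feats)
    # Assumed output format: one dictionary per output, keyed by image name.
    quality = dict(zip(names, scores.tolist()))
    embeddings = dict(zip(names, feats))
    return quality, embeddings


def save_outputs(quality, embeddings, quality_path, embeddings_path):
    """Write both dictionaries to disk with pickle (assumed format)."""
    with open(quality_path, "wb") as f:
        pickle.dump(quality, f)
    with open(embeddings_path, "wb") as f:
        pickle.dump(embeddings, f)
```

With a PyTorch model, the per-batch outputs would typically be moved to the CPU and converted to NumPy arrays before being appended.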

Hopefully the explanation is clear enough. If not, please do not hesitate to ask any follow-up questions.

LSIbabnikz / eDifFIQA

how to train own dataset? #4