Thank you for the great update.
Can you detail the improvement brought by the augmented language instructions?
Is it better at understanding the language instructions? And have you matched each image with a language instruction, or just kept the ratio as in the previous version?