There is an example/HuggingFaceValidation that uses PyCall to compare the results, but it only works with a complete model, not a single layer. If you want to make sure the embedding result is correct, you can follow what that code does and compare the results yourself, along the lines of the sketch below. Personally I would prefer having both CLIPTextEmbedding and CLIPTextModel in a single PR so the model can be tested directly.
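For example, you can pull a single layer out of the Python model and compare it directly. A minimal sketch (not the actual example/HuggingFaceValidation code; the checkpoint name and `jlembed` are placeholders for whatever you are testing):

```julia
using PyCall

torch = pyimport("torch")
transformers = pyimport("transformers")

# load the HuggingFace model and grab the single layer under test
pymodel = transformers.CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")
pyembed = pymodel.text_model.embeddings

input_ids = [1 2 3 4]   # toy 1-based token ids, shape (batch, seq) for Python
py_out = pyembed(torch.tensor(input_ids .- 1)).detach().numpy()

# `jlembed` stands for the Julia layer being validated (defined elsewhere);
# numpy is row major, so the dimensions come back reversed relative to Julia
jl_out = jlembed(permutedims(input_ids))
isapprox(jl_out, permutedims(py_out, (3, 2, 1)); atol=1e-5)
```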
For now (the 0.1.x releases), I won't add CI tests for HuggingFace models. You can open an example/CLIP directory and put test_clip2.jl there.
Yeah, I will add CLIPTextModel too in this PR, so we can test it completely. Thanks for taking a look!
Hi @chengchingwen, I have marked the places in model.jl where I'm not sure how to proceed with TODO. I have added causal attention masks, but I think they need to be combined with the attention masks that are passed as inputs, along the lines of the sketch below.
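Something like this is what I have in mind (the names and the 1/0 padding-mask convention are assumptions on my side, borrowed from how HuggingFace passes `attention_mask`; this is not the Transformers.jl API):

```julia
# Illustrative only: combine a causal mask with a per-token padding mask.
# `attention_mask` is assumed to be (seq_len, batch) with 1 = keep, 0 = pad.
function combined_mask(attention_mask::AbstractMatrix, seq_len::Integer)
    # causal part: query position j may only attend to key positions i <= j
    causal = [i <= j for i in 1:seq_len, j in 1:seq_len]   # (key, query)
    pad = reshape(attention_mask .== 1, seq_len, 1, :)     # (key, 1, batch)
    mask = causal .& pad                                   # (key, query, batch)
    # additive form for the attention logits before softmax
    return ifelse.(mask, 0.0f0, -Inf32)
end
```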
The model loads fine, but the results are slightly off; I think it's due to how the layer normalization in Flux.normalize & torch.normalize differs, something like the comparison below.
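For reference, here is the discrepancy I suspect (an assumption on my side, not something I have confirmed in either codebase): Flux adds ϵ to the standard deviation, while torch's layer norm adds eps to the variance before taking the square root:

```julia
using Statistics

# two epsilon placements that give slightly different outputs
flux_style(x; ϵ=1f-5)  = (x .- mean(x)) ./ (std(x, corrected=false) .+ ϵ)
torch_style(x; ϵ=1f-5) = (x .- mean(x)) ./ sqrt.(var(x, corrected=false) .+ ϵ)

x = randn(Float32, 8)
maximum(abs.(flux_style(x) .- torch_style(x)))  # small but nonzero
```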
Let me know what you think. Thanks!
Hey, I would like to add CLIPTextEmbedding, and later CLIPTextModel, to Transformers.jl. While doing so, I want to learn about the Transformers.jl implementation details and also how we compare PyTorch model results with Julia results.
Based on https://github.com/huggingface/transformers/blob/f68796bd603ef60173e093f50a1ecbb74bc2ba6b/src/transformers/models/clip/modeling_clip.py#L200
I have added a test_clip2.jl which is similar to the Python code below. Please suggest corrections, improvements, and better approaches to doing this. Thanks!
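For context, the linked layer does a token-embedding lookup plus a learned absolute position embedding. A rough Julia sketch of that computation (illustrative only, not the code in this PR; the struct and the hyperparameters are my own placeholders):

```julia
using Flux

# Sketch of the computation in HF's CLIPTextEmbeddings:
# token lookup + learned absolute position embedding.
struct CLIPTextEmbedding{T,P}
    token_embedding::T      # vocab_size => embed_dim lookup
    position_embedding::P   # max_position => embed_dim lookup
end
Flux.@functor CLIPTextEmbedding

function (m::CLIPTextEmbedding)(input_ids::AbstractMatrix{<:Integer})
    tok = m.token_embedding(input_ids)                # (embed_dim, seq, batch)
    pos = m.position_embedding(1:size(input_ids, 1))  # (embed_dim, seq)
    return tok .+ pos                                 # broadcast over batch
end

# hyperparameters of the base CLIP text model, for illustration
embed = CLIPTextEmbedding(Flux.Embedding(49408 => 512), Flux.Embedding(77 => 512))
```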