rom1504 / embedding-reader

Efficiently read embedding in streaming from any filesystem
MIT License

How do you prepare the local dataset? #35

Closed zhenzi0322 closed 1 year ago

zhenzi0322 commented 1 year ago

I want to prepare a dataset similar to the one in the link below

https://mystic.the-eye.eu/public/AI/cah/laion5b/embeddings/laion2B-en/

img_emb_0005.npy

NUMPY v {'descr': '<f2', 'fortran_order': False, 'shape': (938705, 768), }

What does 938705 stand for? What does 768 stand for?

rom1504 commented 1 year ago

938705 is the number of rows (one row per embedding), and 768 is the dimension of each embedding.

You can use numpy to save embeddings in this format.
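
A minimal sketch of what saving with numpy could look like, assuming you already have your embeddings as a 2-D array of shape (number of rows, dimension). The helper name `save_embedding_shard`, the random placeholder data, and the temporary output folder are illustrative assumptions and not part of embedding-reader. The file name pattern and the float16 dtype follow the `img_emb_0005.npy` example and its `'<f2'` header.

```python
import os
import tempfile

import numpy as np


def save_embedding_shard(embeddings, folder, shard_index):
    """Hypothetical helper: save one (num_rows, dim) array as img_emb_XXXX.npy."""
    embeddings = np.asarray(embeddings)
    if embeddings.ndim != 2:
        raise ValueError("expected a 2-D array of shape (num_rows, dim)")
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, f"img_emb_{shard_index:04d}.npy")
    # float16 corresponds to the '<f2' descr on little-endian machines
    np.save(path, embeddings.astype(np.float16))
    return path


# Placeholder data: 1000 random vectors of dimension 768 (replace with real embeddings)
rng = np.random.default_rng(0)
fake_embeddings = rng.standard_normal((1000, 768))

out_folder = tempfile.mkdtemp()  # assumption: any writable folder works
path = save_embedding_shard(fake_embeddings, out_folder, 0)

# Memory-map the file to check its shape and dtype without loading it fully
loaded = np.load(path, mmap_mode="r")
print(loaded.shape, loaded.dtype)
```

Each file holds one shard, so a large dataset would be split across several files with increasing indices.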

zhenzi0322 commented 1 year ago


clip-retrieval inference --input_dataset image_folder --output_folder embeddings_folder

Why does the above command not produce the .npy file?

rom1504 commented 1 year ago

Check the clip-retrieval README, and feel free to open an issue there.
