jina-ai / examples

Jina examples and demos to help you get started
https://docs.jina.ai
Apache License 2.0
many-examples: remove kaggle dependency #544

Open alexcg1 opened 3 years ago

alexcg1 commented 3 years ago

As discussed in various meetings with @lusloher, @aga11313, @FionnD

Kaggle makes a user jump through a lot of hoops just to get an example working: install the Kaggle client, set up an API key, run the data getter script.

It's also work for us: we have to ensure datasets haven't moved or changed significantly, and we sometimes have to perform extra steps to process them.

These datasets are generally under Creative Commons licenses or similar. There's no reason why we can't:

Affected examples

FionnD commented 3 years ago

Thanks for creating the issue Alex!

Just to clarify for any engineer: ⚠️ This issue should not be worked on until https://github.com/jina-ai/examples/issues/447 and https://github.com/jina-ai/examples/issues/512 are completed. ⚠️

nan-wang commented 3 years ago

audio-search no longer has a dependency on Kaggle

jakobkruse1 commented 3 years ago

Where could we store the example data? Do we have "somewhere we control" to download from?
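
For illustration only, here is a rough sketch of what a Kaggle-free data getter might look like if the data were hosted at a location we control. The function name `fetch_dataset`, the `.part` temp-file convention, and the optional checksum are all assumptions made for this sketch, not anything that exists in the repo, and the hosting URL is left to the caller because no location has been decided in this thread.

```python
import hashlib
import shutil
import urllib.request
from pathlib import Path
from typing import Optional


def fetch_dataset(url: str, dest: Path, sha256: Optional[str] = None) -> Path:
    """Download `url` to `dest` unless it is already cached locally.

    Hypothetical helper: the name and behaviour are assumptions for this sketch.
    If `sha256` is given, the file is verified and removed on mismatch.
    """
    dest = Path(dest)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temporary name first so an interrupted download
        # is never mistaken for a complete, cached file.
        tmp = dest.with_name(dest.name + ".part")
        with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
            shutil.copyfileobj(response, out)
        tmp.replace(dest)

    if sha256 is not None:
        digest = hashlib.sha256(dest.read_bytes()).hexdigest()
        if digest != sha256:
            # Do not keep a file that fails verification.
            dest.unlink()
            raise ValueError(f"checksum mismatch for {dest}: got {digest}")
    return dest
```

With something like this, an example's setup would reduce to a single call with the dataset URL and a local path: no Kaggle client and no API key. The checksum would also let us notice if a hosted file changed unexpectedly.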

tadejsv commented 3 years ago

I propose using Hugging Face datasets when possible. They are extremely easy to use and very performant too.