many-examples: remove kaggle dependency

alexcg1 commented 3 years ago

As discussed in various meetings with @lusloher, @aga11313, and @FionnD:

Kaggle is a lot of hoops for a user to jump through just to get an example working: install the client, set up an API key, then run the data getter script.

It's also work for us: we have to check that datasets haven't moved or changed significantly, and we sometimes have to perform extra steps to process them.

These datasets are generally under Creative Commons or similar licenses. There's no reason why we can't:

- Download a subset for example purposes (this keeps things light)
- Process that subset ourselves (saves users time and effort)
- Store it either in data/ (for light stuff like text, which can go directly in the repo) or use get_data.sh to download it from somewhere we control (for larger stuff like images)
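
For the text case, the "download a subset and process it ourselves" step could look roughly like the sketch below. This is only an illustration under assumptions: the function name `write_subset`, the CSV format, and the default row count are hypothetical and not taken from any existing example in the repo.

```python
import csv
import random
from pathlib import Path


def write_subset(src: Path, dest: Path, n_rows: int = 1000, seed: int = 42) -> int:
    """Copy the header plus up to n_rows randomly sampled rows from src to dest.

    Hypothetical helper: assumes src is a CSV file with a header row.
    Returns the number of data rows written.
    """
    with src.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    # A fixed seed keeps the committed subset reproducible between runs.
    rng = random.Random(seed)
    sample = rows if len(rows) <= n_rows else rng.sample(rows, n_rows)

    # Create the target directory (e.g. data/) if it does not exist yet.
    dest.parent.mkdir(parents=True, exist_ok=True)
    with dest.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(sample)
    return len(sample)
```

A maintainer would run this once against the full dataset, e.g. `write_subset(Path("full.csv"), Path("data/subset.csv"))` (both paths are placeholders), and commit the result so users never have to touch Kaggle.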