activeloopai / deeplake

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
Mozilla Public License 2.0
8.07k stars, 616 forks

Create a tutorial on Colab #64

Closed davidbuniat closed 3 years ago

davidbuniat commented 3 years ago

Create a tutorial on Colab

Users should be able to load a dataset, train a model, and upload the dataset. Feel free to start from a small example and then make the example comprehensive.

Paola351 commented 3 years ago

I can do this if you need help.

keshav340 commented 3 years ago

Can I do this? It is easy for me.

davidbuniat commented 3 years ago

> Can I do this? It is easy for me.

Assigning you as well. To avoid duplicating the task, please think about an extended introduction to Hub for a user.

Paola351 commented 3 years ago

Sorry, I have to give up as I'm very busy these days!

davidbuniat commented 3 years ago

> Sorry, I have to give up as I'm very busy these days!

No worries at all, thanks for thinking of us!

Tanujcbe commented 3 years ago

@davidbuniat Can I work on it?

davidbuniat commented 3 years ago

Thanks for your willingness to contribute. @Tanujcbe assigning you the task.

Tanujcbe commented 3 years ago

@davidbuniat Please accept pull request #83

davidbuniat commented 3 years ago

@Tanujcbe thanks for the PR. Probably the issue was not clear enough from the beginning.

We are looking for a Hub tutorial on Google Colab that shows users Hub in action, e.g. how they can create datasets and load them into PyTorch/TensorFlow.

Finally, we would like to add the resulting notebook to the README.

thisiseshan commented 3 years ago

Hi. Is this issue resolved? I find this repo interesting and would like to contribute to this issue.

davidbuniat commented 3 years ago

@thisiseshan thanks for your interest. No, this issue is still pending; we would love your help!

thisiseshan commented 3 years ago

Thank you for assigning this to me. Would MNIST classification be an appropriate example?

muskanlalit18 commented 3 years ago

Hi, this looks interesting. I would like to work on it too.

davidbuniat commented 3 years ago

> Thank you for assigning this to me. Would MNIST classification be an appropriate example?

@thisiseshan yes, MNIST is fine as long as the tutorial demonstrates the main values of the package in very simple terms. @muskanlalit18 assigning you the task as well, please feel free to choose another dataset.

drashtipatel2503 commented 3 years ago

Hi, can I go for this?

muskanlalit18 commented 3 years ago

> @muskanlalit18 assigning you the task as well, please feel free to choose another dataset.

Thanks for assigning it to me. Should I try a CIFAR classification example?

AbhinavTuli commented 3 years ago

Sure @muskanlalit18, CIFAR sounds good! Feel free to post here or in the discussions https://github.com/activeloopai/Hub/discussions if you need any help.

thisiseshan commented 3 years ago

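Hi, I am training my model on the MNIST dataset, but it's taking way too long on Colab (3 hrs per epoch). The model seems to be running fine.

I wanted to know:

  • Do datasets from Hub take a bit longer to train?
  • Should I change the model?
  • Should I proceed to the upload dataset part and send a PR so you can review it?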

davidbuniat commented 3 years ago

> Hi, I am training my model on the MNIST dataset, but it's taking way too long on Colab (3 hrs per epoch). The model seems to be running fine.
>
> I wanted to know:
>
>   • Do datasets from Hub take a bit longer to train?
>   • Should I change the model?
>   • Should I proceed to the upload dataset part and send a PR so you can review it?

It is not supposed to take that long. Can you please proceed with the PR and include the Colab link? I will give it a try and see what's wrong.
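
A quick way to narrow down a slow epoch is to time data loading separately from the training step. The sketch below is only an illustration and is not taken from the PR: `profile_epoch` and `slow_batches` are hypothetical names, and it uses only the standard library, so any iterable of batches (a Hub dataset wrapped in a loader, a plain list) can be passed in.

```python
import time
from typing import Callable, Iterable, Tuple


def profile_epoch(batches: Iterable, train_step: Callable[[object], None],
                  max_batches: int = 20) -> Tuple[float, float]:
    """Return (seconds waiting for data, seconds spent in train_step).

    Hypothetical helper: only the first `max_batches` batches are measured.
    """
    load_time = 0.0
    step_time = 0.0
    iterator = iter(batches)
    for _ in range(max_batches):
        start = time.perf_counter()
        try:
            batch = next(iterator)   # time spent fetching/decoding one batch
        except StopIteration:
            break
        load_time += time.perf_counter() - start

        start = time.perf_counter()
        train_step(batch)            # time spent in the model update
        step_time += time.perf_counter() - start
    return load_time, step_time


def slow_batches(n: int, delay: float = 0.01):
    """Hypothetical stand-in for a data loader that is slow to yield."""
    for i in range(n):
        time.sleep(delay)
        yield i


if __name__ == "__main__":
    load, step = profile_epoch(slow_batches(5), lambda batch: None)
    print(f"data loading: {load:.3f}s, training step: {step:.3f}s")
```

If most of the time is spent waiting for data, the bottleneck is likely in fetching or decoding samples rather than in the model. On a GPU, calls can return before the work has finished, so the step time may be underestimated unless the step waits for the result.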

thisiseshan commented 3 years ago

Hi! Please review my PR and let me know your thoughts. I can make necessary changes.

davidbuniat commented 3 years ago

@thisiseshan added comments there. Please also take a look at other Colab examples to demonstrate the core values of Hub.

thisiseshan commented 3 years ago

Hi, the functionality currently implemented in the Colab is: [PyTorch]

Should the 'upload dataset' step be integrated into this notebook? If yes, what dataset should be uploaded as an example (should I use the same example as in the docs)?

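from hub import dataset, tensor

# Placeholder tensors: 20 images of 512x512 and 20 boolean labels
tensor1 = tensor.from_zeros((20, 512, 512), dtype="uint8", dtag="image")
tensor2 = tensor.from_zeros((20,), dtype="bool", dtag="label")

# Keep the dataset returned by from_tensors so it can be stored
# (assumption: store is called on the dataset object, not on the module)
ds = dataset.from_tensors({"name1": tensor1, "name2": tensor2})

ds.store("username/namespace")
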
davidbuniat commented 3 years ago

@thisiseshan for uploading to work, the user would need to be authenticated inside Colab. We don't have tokens yet, so my guess is that it would be tricky to do with Colab.

I think this would be out of scope for the first tutorial. We can do a separate PR for uploading the dataset with colab and add any required functionality :)

thisiseshan commented 3 years ago

Okay perfect!

thisiseshan commented 3 years ago

Hi. I have made the fixes. Please let me know your thoughts and any changes you need :)

mikayelh commented 3 years ago

Hi, @muskanlalit18 ! Hope this finds you well.

Dropping a note to check in on you and ask if you need a hand with uploading the dataset. Feel free to ask us in the GitHub Discussions (we have beta access!) or our dedicated Slack channel. Thanks a mil!

mikayelh commented 3 years ago

Hi @thisiseshan! Thanks so much for following up. We will be reviewing the code shortly. Meanwhile, if there is any other task in the project board you'd like to work on, feel free to let us know!

thisiseshan commented 3 years ago

Sure, Thank you!

mikayelh commented 3 years ago

thank YOU, @thisiseshan :)

muskanlalit18 commented 3 years ago

> Hi, @muskanlalit18 ! Hope this finds you well.
>
> Dropping a note to check in on you and ask if you need a hand with uploading the dataset. Feel free to ask us in the GitHub Discussions (we have beta access!) or our dedicated Slack channel. Thanks a mil!

Hi, I'm working on it, I shall ask in the Discussions if I need any help.

mikayelh commented 3 years ago

Hi, @muskanlalit18, thanks!

davidbuniat commented 3 years ago

@muskanlalit18 please take a look at @thisiseshan's Colab contribution and feel free to think of another exciting tutorial to demonstrate Hub's capabilities, e.g. uploading a dataset or a more advanced use case such as training on COCO.

mynameisvinn commented 3 years ago

Closing this due to inactivity, will reopen if there's widespread interest.