modAL-python / modAL

A modular active learning framework for Python
https://modAL-python.github.io/
MIT License
2.23k stars 324 forks source link

Train in batches : PyTorch models #145

Open tarunn2799 opened 2 years ago

tarunn2799 commented 2 years ago

Hey, I'm a beginner to active learning and I'm just trying to implement modal along with a standard PyTorch softmax classifier using the skorch api as instructed in your documentation.

As a part of your example you specify doing a X, y = next(iter(dataloader)) with the batch size set to the dataset size, in order to get all the images into a numpy array which is later used for training through uncertainty sampling.

Now, this works for me for small datasets (~5k images) but when I tried it with anything more than that, this becomes extremely memory inefficient and my instance crashes. Can you please (preferably in a slightly simpler way) explain how I could potentially implement Active Learning to happen in batches? I realized this entails making changes in several places including but not limited to, the creation of the ActiveLearner object and training for 10 epochs, querying from a single big vector of images at every nth query, and the like. Please help me out here, modal works PERFECTLY for my usecase and I really want to get this working.

Thanks in advance.

tarun-msd commented 2 years ago

@cosmic-cortex Hello, can you please help me out here? TIA

geantrindade commented 1 year ago

I too would really love to see some solutions for this...