Closed lovettchris closed 1 year ago
I would say the redundancy is a work in progress, since I drafted these notebooks from memory without actually knowing whether they would be sufficient for users. Nevertheless, I am glad you are offering great feedback, and I will definitely improve their design and connect them to resemble the ones Piero has implemented for discrete search.
Fixed.
Yes, `model = Model()` should be in the last cell; otherwise the same model instance keeps training and the losses just keep accumulating.
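The statefulness can be illustrated with a minimal sketch (a hypothetical toy `Model` and `fit`, not the notebook's actual classes): if the cell that constructs the model is not re-run, each call to `fit` continues from the previous state.

```python
class Model:
    """Toy stand-in for the notebook's model: its state persists across fits."""
    def __init__(self):
        self.steps_trained = 0

def fit(model, epochs):
    # Training mutates the model in place, so repeated cell re-runs accumulate.
    model.steps_trained += epochs
    return model.steps_trained

model = Model()
fit(model, 100)          # first run of the training cell
total = fit(model, 100)  # re-running the cell continues from the old state
print(total)             # → 200

model = Model()          # recreating the model in the last cell resets training
print(fit(model, 100))   # → 100
```

This is why moving `model = Model()` into the final cell makes each run start fresh.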
Did you, by any chance, remove these arguments: `limit_train_batches=1`, `limit_test_batches=1`, `limit_predict_batches=1`? They limit the number of batches passed during training, and even though `max_steps` or `max_epochs` is set to a higher number, they take precedence and restrict training to a single batch per epoch (which would explain the progress bar showing only `1/1`).
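The precedence can be sketched in pure Python (a hypothetical helper, mirroring how PyTorch Lightning caps batches per epoch, not an actual Lightning function): the per-epoch batch count is the minimum of the dataloader length and the limit, regardless of `max_epochs`.

```python
def batches_per_epoch(num_batches_in_loader, limit_train_batches=None):
    """Per-epoch batch count: the limit takes precedence over the loader size."""
    if limit_train_batches is None:
        return num_batches_in_loader
    return min(num_batches_in_loader, limit_train_batches)

# With limit_train_batches=1, every epoch runs exactly one batch,
# which is why the progress bar shows 1/1 even with max_epochs=100.
print(batches_per_epoch(938, 1))   # → 1
print(batches_per_epoch(938))      # → 938
```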
By default, MNIST only offers two sets, which in our case are used as training and validation. Do you think it is worth splitting the validation set into validation/test and then allowing `get_test_dataset()` to be invoked? Or should we just improve the warning message?
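If we go with the split, a sketch of the index partitioning (pure Python; the function name is an assumption, and a real implementation would presumably wrap the index lists in `torch.utils.data.Subset`):

```python
import random

def split_indices(n, test_fraction=0.5, seed=0):
    """Deterministically partition range(n) into validation/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed keeps the split reproducible
    cut = int(n * (1 - test_fraction))
    return idx[:cut], idx[cut:]

# e.g. MNIST's 10k held-out images split into 5k validation / 5k test
val_idx, test_idx = split_indices(10_000)
print(len(val_idx), len(test_idx))  # → 5000 5000
assert not set(val_idx) & set(test_idx)  # the two sets are disjoint
```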
Unfortunately `accelerator=gpu` is an argument of the PyTorch Lightning trainer. We could do one of the following:

- Create the `PlTrainer` with `device` and use it as the `accelerator` in the class constructor.
- Change `PartialTrainingValAccuracy` to use `accelerator` instead of `device`.

Which do you think would be better?
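The first option could be a thin mapping from torch-style `device` strings to Lightning `accelerator` names; a hypothetical sketch (the function name and mapping table are assumptions, not existing Archai or Lightning API):

```python
def device_to_accelerator(device):
    """Map a torch-style device string to a PyTorch Lightning accelerator name."""
    mapping = {"cuda": "gpu", "cpu": "cpu", "mps": "mps"}
    base = device.split(":")[0]  # "cuda:0" -> "cuda"
    try:
        return mapping[base]
    except KeyError:
        raise ValueError(f"Unsupported device: {device!r}")

print(device_to_accelerator("cuda"))    # → gpu
print(device_to_accelerator("cuda:0"))  # → gpu
print(device_to_accelerator("cpu"))     # → cpu
```

With such a helper, users could keep passing `device="cuda"` in both places and the trainer wrapper would translate it internally.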
Describe the bug
This is unfiltered "first impression" feedback from playing with some of the new Jupyter notebooks. The notebooks are great, but there's a lot of overlap across them; is that by design? For example, the notebook `docs\getting_started\notebooks\cv\pl_trainer.ipynb` seems to be low level and doesn't show the connection to Archai so much as `algos.ipynb`, which ties things together and also shows how to create the `MnistDatasetProvider` already. So is this redundancy by design, or is it just a work in progress?

Feedback on `docs\getting_started\notebooks\cv\pl_trainer.ipynb`
After about 10 runs I see this:
If I re-execute the initial code blocks to recreate the dataset and model, it drops back to around 2. Is the model stateful or something? Perhaps `model = Model()` should be in the last block, not the block that defines the model? But I think it did do the 100 epochs?
When I ask for the test dataset:

`test_dataset = dataset_provider.get_test_dataset()`

I get a weird warning saying `archai.datasets.cv.mnist_dataset_provider — WARNING — Testing set not available for mnist. Returning validation set`, which is odd. It would be nice if we could show the proper procedure here even with MNIST, and add the test code block `trainer.test(model, DataLoader(test_dataset))`
at the end.

It is unfortunate that the `PlTrainer` takes `accelerator='gpu'` whereas the `PartialTrainingValAccuracy` in `algos.ipynb` takes `device="cuda"`...