microsoft / archai

Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
https://microsoft.github.io/archai
MIT License

[BUG] some minor notebook bugs #155

Closed lovettchris closed 1 year ago

lovettchris commented 1 year ago

Describe the bug

This is unfiltered "first impression" feedback from playing with some of the new Jupyter notebooks. The notebooks are great, but there's a lot of overlap across them. Is that by design? For example, docs\getting_started\notebooks\cv\pl_trainer.ipynb seems to be low level and doesn't show the connection to Archai as much as algos.ipynb, which ties things together and already shows how to create the MnistDatasetProvider. So is this redundancy by design, or is it just a work in progress?

Feedback on docs\getting_started\notebooks\cv\pl_trainer.ipynb

  1. Some typos:
  2. Every time I execute this block, the val_loss increases:

[screenshot]

After about 10 runs I see this:

[screenshot]

If I re-execute the initial code blocks to recreate the dataset and model, it drops back to around 2. Is the model stateful or something? Perhaps model = Model() should be in the last block, not the block that defines the model? (See the sketch after this list.)

  3. When I set max_epochs=100 instead of max_steps=1, the progress bar output is confusing, as it says "1/1":

[screenshot]

But I think it did do the 100 epochs?

  4. When I ask for the test dataset with test_dataset = dataset_provider.get_test_dataset(), I get a weird warning: archai.datasets.cv.mnist_dataset_provider — WARNING — Testing set not available for mnist. Returning validation set. It would be nice to show the proper procedure here, even with MNIST, and add the test code block trainer.test(model, DataLoader(test_dataset)) at the end (see the sketch after this list).

  5. It is unfortunate that the PlTrainer takes "accelerator='gpu'" whereas the PartialTrainingValAccuracy in algos.ipynb takes device="cuda"...
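To make points 2 and 4 concrete, here is roughly the shape I would expect for the final cells. This is only a sketch: Model is the LightningModule defined in the notebook, and the provider accessor names other than get_test_dataset() are assumptions.

```python
from torch.utils.data import DataLoader
from pytorch_lightning import Trainer

from archai.datasets.cv.mnist_dataset_provider import MnistDatasetProvider

dataset_provider = MnistDatasetProvider()
train_dataset = dataset_provider.get_train_dataset()  # accessor name assumed
val_dataset = dataset_provider.get_val_dataset()      # accessor name assumed

# Point 2: construct the model in the same cell that trains it, so re-running
# the cell starts from fresh weights instead of training the same instance.
model = Model()  # the LightningModule defined earlier in the notebook

# Plain Lightning Trainer shown here; the notebook's PlTrainer wrapper takes
# the same keyword arguments.
trainer = Trainer(max_steps=1, accelerator="gpu", devices=1)
trainer.fit(model, DataLoader(train_dataset), DataLoader(val_dataset))

# Point 4: end with an explicit test step (today get_test_dataset() falls back
# to the validation set and logs a warning).
test_dataset = dataset_provider.get_test_dataset()
trainer.test(model, DataLoader(test_dataset))
```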

gugarosa commented 1 year ago

I would say that the redundancy is a work in progress, since I drafted these notebooks off the top of my head, without actually knowing whether they would be sufficient for users. Nevertheless, I am glad that you are offering great feedback, and I will definitely improve their design and connections to resemble the notebooks that Piero has implemented for discrete search.

Answers to the feedback

  1. Fixed.

  2. Yes, model = Model() should be in the last cell; otherwise it just keeps accumulating the losses.

  3. Did you, by any chance, remove these arguments: limit_train_batches=1, limit_test_batches=1, limit_predict_batches=1? They limit the number of batches passed during training, and even though max_steps or max_epochs is set to a higher number, they take precedence and cap training at a single batch, which would explain the progress bar showing only 1/1. (There is a minimal illustration at the end of this comment.)

  4. By default, MNIST only offers two sets, which in our case are used for training and validation. Do you think it's worth splitting the validation set into validation/test and then allowing get_test_dataset() to be invoked (see the split sketch below)? Or should we just improve the warning message?

  5. Unfortunately, accelerator='gpu' is an argument of the PyTorch Lightning trainer. We could do one of the following:

    • Add a device keyword argument to PlTrainer and map it to accelerator in the class constructor (see the wrapper sketch below).
    • Change PartialTrainingValAccuracy to take accelerator instead of device.

    Which one do you think would be better?
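For point 3, a minimal Lightning-only illustration of why those limits dominate (these are standard pytorch_lightning.Trainer arguments):

```python
from pytorch_lightning import Trainer

# limit_train_batches=1 caps every epoch at a single batch, so the progress
# bar reads "1/1" even though all 100 epochs still run.
trainer = Trainer(
    max_epochs=100,
    limit_train_batches=1,
    limit_val_batches=1,
)
```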
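If splitting is the way to go for point 4, here is a sketch using torch.utils.data.random_split, assuming get_val_dataset() returns a standard torch Dataset (the accessor name and the 50/50 ratio are assumptions):

```python
import torch
from torch.utils.data import random_split

from archai.datasets.cv.mnist_dataset_provider import MnistDatasetProvider

dataset_provider = MnistDatasetProvider()
val_dataset = dataset_provider.get_val_dataset()  # accessor name assumed

# Carve a held-out test subset out of the current validation set.
n_test = len(val_dataset) // 2
n_val = len(val_dataset) - n_test
val_subset, test_subset = random_split(
    val_dataset,
    [n_val, n_test],
    generator=torch.Generator().manual_seed(42),  # keep the split reproducible
)
```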
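And for the first option in point 5, a sketch of the wrapper idea; this is not the actual PlTrainer code, and the mapping dict is hypothetical:

```python
from pytorch_lightning import Trainer

# Hypothetical mapping from torch-style device names to Lightning accelerators.
_DEVICE_TO_ACCELERATOR = {"cuda": "gpu", "cpu": "cpu"}

class PlTrainer(Trainer):
    def __init__(self, device: str = "cpu", **kwargs):
        # Translate device="cuda" into accelerator="gpu" so PlTrainer and
        # PartialTrainingValAccuracy accept the same argument.
        kwargs.setdefault("accelerator", _DEVICE_TO_ACCELERATOR.get(device, device))
        super().__init__(**kwargs)

# Usage: PlTrainer(device="cuda", max_steps=1) would behave like
# Trainer(accelerator="gpu", max_steps=1).
```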