Open manerotoni opened 9 months ago
Fortunately it works with 3.8, which is the version of Python used on BAND. I tried putting the class definition in a separate file, and then it works nicely.
Maybe after the course we should make sure to fix this. I am not sure whether this is a bug in the multiprocessing module or a feature.
Similar problem with 3.8.10 (the same Python version as on BAND).
It is all a little strange as there must be some Windows/package issues. I have been using very similar code in other projects and never had this error.
I am moving my changes to BAND now to wrap up the course work.
Good to know that this issue exists on Windows.
This is most likely because multiprocessing works differently on Windows. We probably need some workaround that imports the dataset from utils.py for that case.
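For context, a minimal standard-library sketch of what "works differently" refers to. Windows only offers the `spawn` start method, where each worker is a fresh interpreter that has to import every class it needs, whereas `fork` on Linux copies the parent process, including classes defined in a notebook cell. The helper name `describe_start_methods` is made up for this illustration.

```python
import multiprocessing as mp
import sys


def describe_start_methods():
    """Return the start methods available on this platform and the default one."""
    available = mp.get_all_start_methods()
    default = mp.get_context().get_start_method()
    return available, default


available, default = describe_start_methods()

# On Windows this reports only "spawn"; other platforms list more options.
print(f"platform={sys.platform} available={available} default={default}")
```

If the default reported here is `spawn`, a dataset class that lives only in the notebook cannot be found by the workers, which matches the error described in this issue.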
P.S. I can't really fix this since I don't have access to a Windows machine. We can see how to address this after the course.
I am just adding this link: https://bobswinkels.com/posts/multiprocessing-python-windows-jupyter/ Basically, the best option would be to move the function into a separate Python file and import it. It is a little unfortunate that the error only appears in the command window and not within the Jupyter notebook.
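A minimal sketch of that approach, using only the standard library so it runs without PyTorch. The file name `toy_dataset_module.py` and the class `ToyDataset` are hypothetical stand-ins for the course's utils.py and CustomDataset. The class is written to a file, imported, and then round-tripped through pickle, which is how multiprocessing sends objects to spawned workers.

```python
import importlib
import pickle
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module source; in a notebook this could be written with %%writefile.
MODULE_SOURCE = textwrap.dedent(
    '''
    class ToyDataset:
        """Stand-in for a dataset class kept outside the notebook."""

        def __init__(self, items):
            self.items = list(items)

        def __len__(self):
            return len(self.items)

        def __getitem__(self, index):
            return self.items[index]
    '''
)

# Write the module into a temporary directory and make that directory importable.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "toy_dataset_module.py").write_text(MODULE_SOURCE)
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

from toy_dataset_module import ToyDataset

dataset = ToyDataset([10, 20, 30])

# pickle records the class as "toy_dataset_module.ToyDataset", a name that any
# process can import as long as the file is on its import path.
restored = pickle.loads(pickle.dumps(dataset))
print(type(restored).__module__, len(restored), restored[1])
```

The pickled payload refers to the class by module and name rather than containing its code, which is why a class that exists only in a notebook's `__main__` cannot be resolved by a freshly spawned worker.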
I found out why the loader did not cause problems in my case. If you set num_workers = 0 (load data in the main process only, which is the default), then it does not complain that it cannot find the Dataset class.
For the sake of cross-OS usability I would remove this option. For the course it is not crucial.
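The simplest fix is to drop the option and rely on the default of 0. If worker processes were still wanted where they are safe, one possible alternative is sketched below. `portable_num_workers` is a hypothetical helper, not part of PyTorch or the course code, and the rule "only use workers when the start method is fork" is an assumption made for this illustration.

```python
import multiprocessing as mp


def portable_num_workers(requested, start_method=None):
    """Return `requested` only if workers are forked, otherwise 0.

    Hypothetical helper: with any start method other than "fork", workers
    cannot see classes defined in a notebook cell, so loading stays in the
    main process.
    """
    if requested < 0:
        raise ValueError("requested must be >= 0")
    if start_method is None:
        start_method = mp.get_context().get_start_method()
    return requested if start_method == "fork" else 0


print(portable_num_workers(4, start_method="spawn"))  # 0
print(portable_num_workers(4, start_method="fork"))   # 4

# Assumed usage with PyTorch (not executed here):
# loader = DataLoader(dataset, batch_size=8, num_workers=portable_num_workers(4))
```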
For the moment, just displaying images with num_workers > 0 is really slow. Not sure why, maybe because it needs to start all the worker processes. In fact, the time increases with more workers. Not sure if the training is faster once the workers are all running in the background.
@manerotoni : great that you figured this out. Let's set the number of workers to 0. This is indeed not crucial at all here. (It can make a difference for more complex pipelines but I don't expect a big difference here at all.)
Do you want to create a PR to fix this?
For the moment, just displaying images with num_workers > 0 is really slow. Not sure why, maybe because it needs to start all the worker processes.
Yes, this is slower because all the worker processes need to start.
I will do a PR
Hi, I spotted an issue in the notebooks when using Python 3.11 (and maybe other versions) on my Windows machine. Somehow the Dataset class (CustomDataset), when defined in the notebook (e.g. torch_infection_classifier.ipynb), issues an error upon the
The error in the command window is
A few Google searches indicate that the declaration in the notebook is somehow not understood and that there are incompatibilities between multiprocessing and interactive mode (Jupyter notebooks). See https://discuss.pytorch.org/t/issue-with-pretrained-resnet-fixed/109637/4 and https://stackoverflow.com/questions/73763151/multiprocessing-error-self-reduction-pickle-loadfrom-parent-attributeerror