working on google colab

hendrycks / anomaly-seg

The Combined Anomalous Object Segmentation (CAOS) Benchmark

MIT License


working on google colab #9

Closed julienguegan closed 4 years ago

julienguegan commented 4 years ago

Does the code work on Google Colab?

I get the error "KeyError: Caught KeyError in DataLoader worker process 0." when calling train(segmentation_module, iterator_train, optimizers, history, epoch+1, cfg), after running the command python3 train.py --gpu 1 ...

hendrycks commented 4 years ago

We have not tried using this on Google Colab.

On Tue, Jul 7, 2020 at 3:58 AM Julien Guégan notifications@github.com wrote:

Does the code work on Google Colab?

I get the error "KeyError: Caught KeyError in DataLoader worker process 0." when calling train(segmentation_module, iterator_train, optimizers, history, epoch+1, cfg), after running the command python3 train.py --gpu 1 ...

https://colab.research.google.com/drive/1mzgBaWtCpfbojTqBIyYqCMAgYCicZKq8?usp=sharing


julienguegan commented 4 years ago

Note that I followed the recommended steps in the README, and the problem may actually come from the python create_dataset.py step, where I get "FileNotFoundError: [Errno 2] No such file or directory: 'data/train/images/test/'" ... Could you give some more details on how the directories should be laid out?
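A quick way to narrow down this kind of FileNotFoundError is to check which directories exist before running the script. The sketch below is only an illustration: the helper name missing_dirs and the EXPECTED list are hypothetical, and the only path taken from this thread is the one in the error message. The full layout that create_dataset.py expects has to come from the README.

```python
from pathlib import Path


def missing_dirs(root, expected):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


# Assumption: only this path is known (from the error message above).
# Extend the list to match the layout described in the README.
EXPECTED = ["data/train/images/test"]

if __name__ == "__main__":
    for rel in missing_dirs(".", EXPECTED):
        print(f"missing directory: {rel}")
```

Run from the repository root, this prints one line per directory that is absent and nothing when all of them are present.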

hendrycks commented 4 years ago

@xksteven

xksteven commented 4 years ago

I updated the repo. The error should be gone now. Pull it again or download the updated create_dataset.py and defaults file. Thanks for the bug report.

@julienguegan

julienguegan commented 4 years ago

Thanks for the update @xksteven. The previous error has disappeared, but I think something is still missing because I can't get the command python train.py to work... I had to add feed_dict = feed_dict[0] at line 32 of models.py because I had a list and not a dictionary, but then I get "Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same". You can check link_colab if you have time.
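For context, that last message means the input tensor is on the CPU while the model weights are on the GPU. A minimal sketch of a workaround, assuming the batch arrives as a one-element list of dicts whose tensor values must be moved to the model's device, could look like the following. The helper names unwrap_batch and to_device are hypothetical and not part of the repository. The maintainers' own fix is a change to train.py.

```python
def unwrap_batch(feed_dict):
    """Return the batch dict, unwrapping a one-element list or tuple if needed."""
    if isinstance(feed_dict, (list, tuple)):
        if len(feed_dict) != 1:
            raise ValueError(f"expected a single batch, got {len(feed_dict)}")
        feed_dict = feed_dict[0]
    return feed_dict


def to_device(batch, device):
    """Move every value that has a .to() method (e.g. a torch.Tensor) to device.

    Values without .to(), such as strings, are passed through unchanged.
    """
    return {k: v.to(device) if hasattr(v, "to") else v for k, v in batch.items()}
```

With PyTorch tensors, a call such as feed_dict = to_device(unwrap_batch(feed_dict), "cuda") would put the inputs on the same device as a model that was moved to the GPU.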

PS: I also noticed one minor error in eval_ood.py, line 51: in_scores = - conf[np.logical_not(out_label))] has one closing parenthesis too many.
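For reference, the balanced form of that line can be checked in isolation. The arrays below are made-up values for illustration, and treating out_label as a boolean mask where True marks anomalous pixels is an assumption about eval_ood.py, not something confirmed in this thread.

```python
import numpy as np

# Made-up values for illustration only.
conf = np.array([0.9, 0.2, 0.7, 0.1])              # per-pixel confidence
out_label = np.array([False, True, False, True])   # assumed: True = anomalous pixel

# One closing ")" for np.logical_not, then the closing "]" of the index.
in_scores = -conf[np.logical_not(out_label)]
```

Here in_scores holds the negated confidences of the pixels where out_label is False.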

xksteven commented 4 years ago

train.py needs to be modified as per this issue to run on a single GPU.

To run on a single GPU, use

train.py --gpu 0

as opposed to train.py --gpu 1.

I've also uploaded the pretrained models that can be found here

Finally, to run eval_ood.py with the default config/ade20k-resnet50dilated-ppm_deepsup.yaml and the model I've uploaded, change the num_classes parameter in the yaml file to 13.
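The change itself is a one-line edit in the yaml file. For anyone scripting it (for example in a Colab cell), a rough sketch using only the standard library is shown below. It assumes the key appears literally as num_classes: followed by an integer on its own line; the function name set_num_classes is hypothetical, and the exact key and nesting should be checked against the actual config file.

```python
import re


def set_num_classes(yaml_text, value):
    """Replace the integer after every `num_classes:` key in the given YAML text."""
    pattern = re.compile(r"^([ \t]*num_classes[ \t]*:[ \t]*)\d+", flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda m: f"{m.group(1)}{value}", yaml_text)
    if count == 0:
        # Assumption violated: the key is spelled or laid out differently.
        raise KeyError("no num_classes entry found")
    return new_text
```

Reading the config file into a string, passing it through set_num_classes(text, 13), and writing the result back leaves every other line untouched.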

Thanks for pointing out the typo. I fixed it, along with another one.

I've also updated the README to reflect these changes.