maragkakislab / rnakinet-pub


TypeError: 'NoneType' object is not iterable #3

Closed mnsmar closed 1 year ago

mnsmar commented 1 year ago

I ran the following command, but it seems to fail at the last step:

python3 rnamodif/evaluation/run.py --workers 10 --datadir fast5/ --outfile out.tab --model 5eu_v1

using 10 workers
Lightning automatically upgraded your loaded checkpoint from v1.7.0 to v1.9.1. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file rnamodif/checkpoints_pl/5eu_nih_light_conv/last.ckpt`
Using 16bit None Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/3
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/3
Initializing distributed: GLOBAL_RANK: 2, MEMBER: 3/3
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 3 processes
----------------------------------------------------------------------------------------------------

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]
LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2]
LOCAL_RANK: 2 - CUDA_VISIBLE_DEVICES: [0,1,2]
/home/maragkakise2/conda/envs/rnamodif/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:208: UserWarning: num_workers>0, persistent_workers=False, and strategy=ddp_spawn may result in data loading bottlenecks. Consider setting persistent_workers=True (this is a limitation of Python .spawn() and PyTorch)
  rank_zero_warn(
Predicting: 0it [00:00, ?it/s]/home/maragkakise2/conda/envs/rnamodif/lib/python3.9/site-packages/pytorch_lightning/loops/epoch/prediction_epoch_loop.py:173: UserWarning: Lightning couldn't infer the indices fetched for your dataloader.
  warning_cache.warn("Lightning couldn't infer the indices fetched for your dataloader.")
Predicting DataLoader 0: : 842664it [11:33:45, 20.24it/s]
Traceback (most recent call last):
  File "/home/maragkakise2/workspace/projects/ontml/RNAModif/rnamodif/evaluation/run.py", line 85, in <module>
    main()
  File "/home/maragkakise2/workspace/projects/ontml/RNAModif/rnamodif/evaluation/run.py", line 70, in main
    read_predictions = predictions_to_read_predictions(predictions)
  File "/home/maragkakise2/workspace/projects/ontml/RNAModif/rnamodif/evaluation/run.py", line 18, in predictions_to_read_predictions
    for preds, ids in predictions:
TypeError: 'NoneType' object is not iterable

MartinekV commented 1 year ago

Should be fixed in 53a0a4f. The issue was that the code currently doesn't support multi-GPU inference, and PyTorch Lightning is too smart: it picked multi-GPU automatically since you have 3 GPUs.
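
For anyone hitting the same traceback: the log above shows strategy=ddp_spawn, and with process-spawning strategies Lightning's predict call does not hand predictions back to the parent process, so the loop at line 18 of run.py most likely received None. The sketch below is not the code from 53a0a4f. It is a hypothetical guard, and everything in the function body beyond the `for preds, ids in predictions:` line is an assumption made for illustration. It only shows how failing early with a clear message could look.

```python
from collections import defaultdict


def predictions_to_read_predictions(predictions):
    """Group per-batch outputs by read id.

    Hypothetical reimplementation: only the loop header is taken from the
    traceback; the grouping logic is an assumption for illustration.
    """
    if predictions is None:
        # Likely cause: inference ran under a process-spawning multi-GPU
        # strategy, which does not return predictions to the caller.
        raise RuntimeError(
            "No predictions were returned. Multi-GPU inference is not "
            "supported; restrict the run to a single device, for example "
            "Trainer(accelerator='gpu', devices=1)."
        )

    read_predictions = defaultdict(list)
    for preds, ids in predictions:
        # Each batch is assumed to be a pair of equal-length sequences.
        for pred, read_id in zip(preds, ids):
            read_predictions[read_id].append(pred)
    return dict(read_predictions)
```

If pulling the fix is not an option, limiting the visible GPUs from the shell (for example, setting CUDA_VISIBLE_DEVICES=0 before the command) should also keep Lightning from selecting a multi-GPU strategy on its own.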

mnsmar commented 1 year ago

Okay, thanks. I pulled the changes and am testing them now. I will close the issue if it runs successfully.

FYI, I noticed that with 3 GPUs it was doing ~20 it/s. With 1 GPU it is now doing 19.7 it/s. I did not expect such a small difference. I suspect something is wrong either with the iteration counting or with execution on multiple GPUs. The weird thing is that it was keeping all 3 GPUs at 100%. It previously took 11 hrs to complete. Let's see how long it takes now.
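
As a rough sanity check on those numbers, the snippet below recomputes the rate from the progress bar in the log (842664 iterations in 11:33:45) and estimates the single-GPU runtime at 19.7 it/s. It assumes the single-GPU run processes the same number of iterations as the one shown on the 3-GPU progress bar, which is not confirmed; if the work had been split across GPUs, the single-GPU count would be higher.

```python
def hms_to_seconds(hours, minutes, seconds):
    """Convert an elapsed time given as h:m:s to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Values taken from the progress bar in the log above.
iterations = 842664
elapsed_s = hms_to_seconds(11, 33, 45)

# Rate observed during the 3-GPU run.
rate_3gpu = iterations / elapsed_s

# Estimated runtime on 1 GPU at the reported 19.7 it/s, assuming
# (hypothetically) the same total number of iterations.
rate_1gpu = 19.7
estimated_hours_1gpu = iterations / rate_1gpu / 3600

print(f"3-GPU rate: {rate_3gpu:.2f} it/s")
print(f"Estimated 1-GPU runtime: {estimated_hours_1gpu:.1f} h")
```

Under that assumption the single-GPU run would finish in roughly the same time as before, which would be consistent with each of the 3 GPUs having processed the full dataset rather than a third of it.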