deep-learning-with-pytorch / dlwpt-code

Code for the book Deep Learning with PyTorch by Eli Stevens, Luca Antiga, and Thomas Viehmann.
https://www.manning.com/books/deep-learning-with-pytorch

Unable to Run training.py ch11. CandidateInfo_List assertion error. #114

Open evilgangsta opened 5 months ago

evilgangsta commented 5 months ago

AssertionError                            Traceback (most recent call last)
Cell In[7], line 1
----> 1 run('p2ch11.training.LunaTrainingApp', '--epochs=1')

Cell In[2], line 7
      4     log.info("Running: {}({!r}).main()".format(app, argv))
      6     app_cls = importstr(*app.rsplit('.', 1))  # <2>
----> 7     app_cls(argv).main()
      9     log.info("Finished: {}.{!r}).main()".format(app, argv))

File ~/Documents/deeplearningwithpytorch/dlwpt-code-master/p2ch11/training.py:140, in LunaTrainingApp.main(self)
    137 def main(self):
    138     log.info("Starting {}, {}".format(type(self).__name__, self.cli_args))
--> 140     train_dl = self.initTrainDl()
    141     val_dl = self.initValDl()
    143     for epoch_ndx in range(1, self.cli_args.epochs + 1):

File ~/Documents/deeplearningwithpytorch/dlwpt-code-master/p2ch11/training.py:90, in LunaTrainingApp.initTrainDl(self)
     89 def initTrainDl(self):
---> 90     train_ds = LunaDataset(
     91         val_stride=10,
     92         isValSet_bool=False,
     93     )
     95     batch_size = self.cli_args.batch_size
     96     if self.use_cuda:

File ~/Documents/deeplearningwithpytorch/dlwpt-code-master/p2ch11/dsets.py:171, in LunaDataset.__init__(self, val_stride, isValSet_bool, series_uid, sortby_str)
    169 elif val_stride > 0:
    170     del self.candidateInfo_list[::val_stride]
--> 171     assert self.candidateInfo_list
    173 if sortby_str == 'random':
    174     random.shuffle(self.candidateInfo_list)

AssertionError:

Processor: i5-10500H (6 cores)
GPU: GTX 1650
RAM: 16 GB

I have downloaded the LUNA dataset to my external hard drive, while the code resides on my internal SSD. I am not certain whether that is causing the issue. If so, how should I change my code? I am using the code from the GitHub repo as provided and, as far as I remember, have not modified anything.

LYK0520 commented 5 months ago

I have the same problem.

evilgangsta commented 5 months ago

> I have the same problem.

I was able to partially solve this by placing the code on the same hard drive as the dataset, but the limited speed of the HDD slows down both caching and training.

LYK0520 commented 5 months ago

I have extracted the dataset and placed it in the "data\part2\luna" directory, but I still cannot resolve the aforementioned issue. The error message that appears is as follows:

--> 171 assert self.candidateInfo_list
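
For anyone else landing here, the following is a minimal sketch of why that line fails, not the repo's actual code. `assert self.candidateInfo_list` raises whenever the list is empty, and an empty list stays empty after the `val_stride` slicing. The function name `split_candidates` and the validation branch are assumptions added for illustration; only the `elif` branch and the assertion are visible in the traceback above.

```python
def split_candidates(candidates, val_stride, is_val_set):
    """Illustrative stand-in for the slicing shown at dsets.py lines 169-171."""
    candidates = list(candidates)
    if is_val_set:
        # Assumed validation branch: keep every val_stride-th item.
        candidates = candidates[::val_stride]
    elif val_stride > 0:
        # Training branch, as in the traceback: drop every val_stride-th item.
        del candidates[::val_stride]
    # An empty list is falsy, so this fires when no candidates were loaded.
    assert candidates, "candidate list is empty: was any CT data found on disk?"
    return candidates


# With 20 candidates and val_stride=10, indices 0 and 10 are removed.
train = split_candidates(range(20), val_stride=10, is_val_set=False)
print(len(train))  # 18
```

So the assertion itself is only a symptom: it usually means the candidate list was already empty before the split, for example because no scan files were found.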

LYK0520 commented 5 months ago

I solved this. You can run p2ch11.prepcache.LunaPrepCacheApp first to speed up training. num_workers can be raised if the CPU is not fully utilized.

evilgangsta commented 5 months ago

> I solved this. You can run p2ch11.prepcache.LunaPrepCacheApp first to speed up training. num_workers can be raised if the CPU is not fully utilized.

The prepcache app only fills the cache so that the first epoch doesn't have to load the data from scratch. In my case the assertion error occurred because dsets.py could not find the data on my external hard drive. I guess num_workers should equal the number of CPU cores the machine has. Also, can you please let me know your caching time and machine specs? Mine takes an awfully long time with more than 5 subsets (more than 14 hours just to cache 7 subsets ;) ).
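
A quick way to check whether the data is visible from where the code runs is sketched below. It assumes dsets.py discovers scans with a glob pattern relative to the current working directory; the default pattern used here (`data-unversioned/part2/luna/subset*/*.mhd`) is my recollection of the repo and should be checked against your copy of dsets.py. The helper name `count_scans` is hypothetical.

```python
import glob
import os


def count_scans(pattern="data-unversioned/part2/luna/subset*/*.mhd"):
    """Return how many files match the pattern, resolved from the current directory."""
    return len(glob.glob(pattern))


# A relative pattern is resolved against this directory, not the dataset drive.
print("working directory:", os.getcwd())
print("matching .mhd files:", count_scans())
# Logical CPU count, a common reference point when choosing num_workers.
print("logical CPUs:", os.cpu_count())
```

If the count is 0, the candidate list would end up empty and the assertion would fire. Running from the repo root, or pointing the expected directory at the external drive (for instance with a symlink), would be the things to try first.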