zoogzog / chexnet

Implementation of the CheXNet network (PyTorch)

NaN Error Testing #9

Open BobPetrocelli opened 6 years ago

BobPetrocelli commented 6 years ago

I'm using Python 3.6 and CUDA 9.1.

The test runs to completion using the prebuilt model, then I get:

Traceback (most recent call last):
  File "Main.py", line 81, in <module>
    main()
  File "Main.py", line 12, in main
    runTest()
  File "Main.py", line 76, in runTest
    ChexnetTrainer.test(pathDirData, pathFileTest, pathModel, nnArchitecture, nnClassCount, nnIsTrained, trBatchSize, imgtransResize, imgtransCrop, timestampLaunch)
  File "/media/bob/curie/chexnet-master (2)/ChexnetTrainer.py", line 247, in test
    aurocIndividual = ChexnetTrainer.computeAUROC(outGT, outPRED, nnClassCount)
  File "/media/bob/curie/chexnet-master (2)/ChexnetTrainer.py", line 175, in computeAUROC
    outAUROC.append(roc_auc_score(datanpGT[:, i], datanpPRED[:, i]))
  File "/home/bob/.local/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 277, in roc_auc_score
    sample_weight=sample_weight)
  File "/home/bob/.local/lib/python3.6/site-packages/sklearn/metrics/base.py", line 75, in _average_binary_score
    return binary_metric(y_true, y_score, sample_weight=sample_weight)
  File "/home/bob/.local/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 272, in _binary_roc_auc_score
    sample_weight=sample_weight)
  File "/home/bob/.local/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 534, in roc_curve
    y_true, y_score, pos_label=pos_label, sample_weight=sample_weight)
  File "/home/bob/.local/lib/python3.6/site-packages/sklearn/metrics/ranking.py", line 324, in _binary_clf_curve
    assert_all_finite(y_score)
  File "/home/bob/.local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 54, in assert_all_finite
    _assert_all_finite(X.data if sp.issparse(X) else X)
  File "/home/bob/.local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 44, in _assert_all_finite
    " or a value too large for %r." % X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

zoogzog commented 6 years ago

That is most likely a problem with the CUDA 9.1 library. The existing code was tested with Python 3.5 and CUDA 8.
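
Whatever the root cause turns out to be, the traceback shows that roc_auc_score rejected the prediction array because it contained NaN or infinite values. A quick way to confirm this, and to see which class columns are affected, is to scan the predictions before computing the AUROC. The following is only a diagnostic sketch, not part of the repository: the function name find_nonfinite_columns and the example values are made up for illustration, and it uses only the Python standard library so it works on nested lists as well as on a 2D NumPy array.

```python
import math


def find_nonfinite_columns(scores):
    """Return {column index: number of NaN/inf entries} for a 2D array-like.

    Hypothetical helper for illustration; it is not defined in the chexnet repo.
    An empty dict means every value is finite.
    """
    bad = {}
    for row in scores:
        for col, value in enumerate(row):
            # float() lets this accept Python floats and NumPy scalar types alike.
            if not math.isfinite(float(value)):
                bad[col] = bad.get(col, 0) + 1
    return bad


# Made-up predictions: 3 samples, 3 classes, with some non-finite entries.
example_pred = [
    [0.12, 0.80, float("nan")],
    [0.55, float("inf"), 0.30],
    [0.91, 0.10, float("nan")],
]

bad_columns = find_nonfinite_columns(example_pred)
print("columns with non-finite scores:", bad_columns)
```

In the code from the traceback, the array to inspect would be datanpPRED inside computeAUROC, just before the roc_auc_score loop. If every column is affected, the network output itself is probably NaN, which would be consistent with an environment or library problem rather than an issue in the metric code.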