Train or test? Please try test first, and try to reduce the batch size.
Thank you! I was executing the testing: python main.py with batch_size = 1.
This is my config_submit:

```python
config = {'datapath': '/home/qrf/DSB2017-master/test',
          'preprocess_result_path': './prep_result/',
          'outputfile': 'prediction.csv',
          'detector_model': 'net_detector',
          'detector_param': './model/detector.ckpt',
          'classifier_model': 'net_classifier',
          'classifier_param': './model/classifier.ckpt',
          'n_gpu': 1,
          'n_worker_preprocessing': 2,
          'use_exsiting_preprocessing': False,
          'skip_preprocessing': False,
          'skip_detect': False}
```
I traced the memory usage: during preprocessing, no GPU memory was used. After it printed "end preprocessing", the usage increased slowly to 239M, then suddenly jumped to 2G and the script exited with the error "out of memory".
Batch size is controlled by the console parameter -b; see main.py.
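For context, a minimal sketch of how such a flag is typically wired up with argparse; the actual argument names and defaults in the repo's main.py may differ:

```python
import argparse

# Hypothetical sketch of the -b console parameter mentioned above;
# the real definition in main.py may use different names/defaults.
parser = argparse.ArgumentParser(description='DSB2017 inference')
parser.add_argument('-b', '--batch-size', type=int, default=1, metavar='N',
                    help='mini-batch size for the detector at test time')
args = parser.parse_args()
print(args.batch_size)  # e.g. `python main.py -b 1` to minimize GPU memory use
```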
Thanks a lot … I haven't found any parameter called "b" or "-b" in the project. I was wondering if the 1060 itself cannot handle such a large amount of computation. Can you give any clue about the basic requirements for testing? Can a GTX 1080 manage it? Thanks!
Thank you!!!
Hi @lfz, my platform is Debian 8 with 4 GeForce GTX 1080 GPUs, each with about 8G of memory. In addition, I have 12 CPU threads. To run python main.py in the root folder, I set

```python
test_loader = DataLoader(dataset, batch_size=1, shuffle=False,
                         num_workers=10, pin_memory=False, collate_fn=collate)
```

The problem occurs again. Could you give any clues to fix the issue? Thanks in advance.
Could you give me the entire command and path you used, and the output log?
My test data contains only two samples from stage1; here's a screenshot of the root folder and the command.
Hi, please try to print the shape of "input" before line 52 of test_detect.py to make sure that the shape is 1x1x128x128x128. If it is, please try to reduce the "sidelen" in line 50 of main.py.
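For reference, a self-contained illustration of that shape check; the dummy tensor stands in for the `input` patch fed to the detector in test_detect.py:

```python
import torch

# Dummy tensor standing in for the batched patch fed to the detector.
# Expected shape per the suggestion above: 1x1x128x128x128
# (later clarified in this thread to be 208 on each side at test time).
input = torch.zeros(1, 1, 128, 128, 128)
print(input.size())  # torch.Size([1, 1, 128, 128, 128])
```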
Thank you. My input size is not the same. According to your code, I think the first dimension depends on the number of GPUs. What do the other dimensions mean? How can I change them to get a size of 1x1x128x128x128?
Oh, it's OK; it should be 208. 128 is the cube size during training; when testing, the size is 208:

208 = 144 + 2*32
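To spell out that arithmetic, a trivial sketch (the variable names are illustrative; the 144/32 split is taken from the formula above):

```python
sidelen = 144                 # sliding-window core size set in main.py
margin = 32                   # overlap margin on each side of the window
patch = sidelen + 2 * margin
print(patch)                  # 208 -> test input is 1 x 1 x 208 x 208 x 208
```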
Make sure that all 4 of your GPUs are being used properly; change "n_gpu" to 1 to test it. If all your GPUs work correctly, try changing the sidelen to a smaller number.
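A rough sketch of why a smaller sidelen helps, assuming activation memory scales roughly with patch volume (an assumption, not stated above):

```python
# Smaller sidelen -> smaller test patch -> less GPU memory per forward pass.
margin = 32
for sidelen in (144, 128, 112):
    patch = sidelen + 2 * margin
    rel = (patch / 208.0) ** 3    # volume relative to the default 208 patch
    print("%d -> patch %d, ~%.2fx the 208 volume" % (sidelen, patch, rel))
# 144 -> 208, 1.00x;  128 -> 192, ~0.79x;  112 -> 176, ~0.61x
```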
It works after reducing sidelen to 112, but the prediction does not seem accurate.
Please run all cases to see the overall score; this one might be a case that everyone gets wrong. To be clearer: the final score of 0.4 is not a high score. The overall accuracy is just above 80%, against a chance level of 70%.
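For context on those numbers, a rough sketch assuming the Kaggle DSB2017 log-loss metric and roughly 30% cancer prevalence (both are assumptions here, not stated above):

```python
import math

# "Chance" performance: always predict the assumed base rate.
p = 0.30                              # assumed prevalence of positives
baseline_acc = 1 - p                  # always predict negative -> 0.70
baseline_logloss = -(p * math.log(p) + (1 - p) * math.log(1 - p))
print(baseline_acc)                   # 0.7  (the ~70% chance level)
print(round(baseline_logloss, 3))     # 0.611; a log loss of 0.4 beats this
```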
@lfz Hi, I found that in training the input size is 128x128x128, but in testing the input size is 208x208x208. Why are the train and test sizes different? To my knowledge, the train and test sizes should be the same. Thanks in advance.
When doing the testing, I watched the GPU memory and found that the script exited when the memory usage reached 2G. My GPU is a 1060 with 6G of memory; how did that happen?
System Info

```
RuntimeError: cuda runtime err(2): out of memory at /opt/conda/conda-bld/pytorch_1501953625411/work/pytorch-0.1.12/torch/libTHC/THCstorage.cu:66
```

- PyTorch or Caffe2: PyTorch
- How you installed PyTorch: conda
- Build command you used (if compiling from source):
- OS: Ubuntu 14.04
- PyTorch version: 0.1.10
- Python version: 2.7
- CUDA/cuDNN version: 8.0/5.1
- GCC version (if compiling from source): 4.9