Some problem when running test.py and train.py

wuhuikai / FastFCN

FastFCN: Rethinking Dilated Convolution in the Backbone for Semantic Segmentation.

http://wuhuikai.me/FastFCNProject


Some problem when running test.py and train.py #58

Closed pp00704831 closed 4 years ago

pp00704831 commented 4 years ago

Hi, I am a beginner in deep learning, and I ran into some problems when running the code. First, I used the command 「 tar -xvf encnet_jpu_res50_pcontext.pth.tar 」 to extract the tar file, but it failed. Second, once I have successfully extracted the file and have the checkpoint, which folder should I put it in? Thank you!

wuhuikai commented 4 years ago

It's not a tar file. Directly put the file into a folder and use it with --resume.
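The point above can be checked before reaching for tar. Below is a minimal sketch, using only the Python standard library, that inspects a file's contents instead of trusting its suffix. The helper name describe_checkpoint_container and the stand-in file are hypothetical and not part of FastFCN. The stand-in is a small pickled dictionary saved under a .pth.tar name, since checkpoints written by torch.save are pickle-based (newer PyTorch versions wrap them in a zip container) and the .pth.tar suffix is only a naming convention.

```python
import os
import pickle
import tarfile
import tempfile
import zipfile


def describe_checkpoint_container(path):
    """Classify a file as 'tar', 'zip', or 'other' by looking at its contents."""
    if tarfile.is_tarfile(path):
        return "tar"
    if zipfile.is_zipfile(path):
        return "zip"
    return "other"


# Stand-in for a checkpoint: pickled data stored under a .pth.tar name.
with tempfile.NamedTemporaryFile(suffix=".pth.tar", delete=False) as f:
    pickle.dump({"epoch": 0}, f)
    demo_path = f.name

kind = describe_checkpoint_container(demo_path)
os.remove(demo_path)  # clean up the stand-in file
print(kind)
```

If a file is not reported as a tar archive, tar -xvf cannot extract it; the file is meant to be passed as-is to --resume.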

pp00704831 commented 4 years ago

Thank you. So where should I put my file? Do you mean that once I have put the file in a folder, I can run test.py with --resume encnet_jpu_res50_pcontext.pth.tar? I can also speak Chinese, so we can talk in Chinese as well. Thank you.

pp00704831 commented 4 years ago

I had not seen the issues page before, so maybe I should look for answers here first. Thank you for your quick reply.

wuhuikai commented 4 years ago

You're right: use --resume to run test.py.

pp00704831 commented 4 years ago

Hello, I ran into this problem:

/home/tsai/anaconda3/envs/python3.5/lib/python3.5/site-packages/torch/nn/functional.py:2351: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
  warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
Segmentation fault (core dumped)

My environment is python=3.5, pytorch=1.0, cuda=9.0. Do I need to change nn.functional.upsample to nn.functional.interpolate? I think it would be better not to change the original code. Thank you.

wuhuikai commented 4 years ago

What's your OS? The code is only tested on Ubuntu 16.04 LTS.

pp00704831 commented 4 years ago

I am using Ubuntu 18.04. Does that mean train.py and test.py cannot run on Ubuntu 18.04?

wuhuikai commented 4 years ago

I'm not sure about this.

pp00704831 commented 4 years ago

Okay, I will first change Python 3.5 to Python 3.6. Sorry for bothering you so many times.

wuhuikai commented 4 years ago

Is the dataset well-prepared?

On 25 Dec 2019, at 11:03 PM, pp00704831 notifications@github.com wrote:

Hello, I have changed my OS to Ubuntu 16.04, with python=3.5, pytorch=1.0, cuda=9.0. I ran the code with this command:

python test.py --dataset pcontext --model encnet --jpu --aux --se-loss --backbone resnet50 --resume encnet_jpu_res50_pcontext.pth.tar --split val --mode testval

But an error occurred:

ValueError: batch_size should be a positive integeral value, but got batch_size=0

I found that if I do not set batch_size, it defaults to batch_size = 16, but the error still occurs. Thank you!

pp00704831 commented 4 years ago

Well... after I reinstalled the datasets, it worked smoothly. Thank you very much.

pp00704831 commented 4 years ago

Hello, I want to continue training the model, starting from the checkpoint you provided for testing, encnet_jpu_res50_pcontext.pth.tar. Could I use a command like the following to train further from these parameters?

python train.py --dataset pcontext \
    --model encnet --jpu --aux --se-loss \
    --backbone resnet50 --checkname encnet_res50_pcontext \
    --resume encnet_jpu_res50_pcontext.pth.tar

I only want to run some simple experiments, so I don't want to train for too many epochs. Thank you!

wuhuikai commented 4 years ago

Please use --ft and use > 1 GPU.
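Combining this advice with the command from the question gives something like the sketch below. It only assembles and prints the command line; it does not run training. The flags are the ones quoted in this thread. The helper name build_finetune_command and the GPU indices 0 and 1 are assumptions made for illustration, and CUDA_VISIBLE_DEVICES is the standard CUDA environment variable for selecting which GPUs are visible.

```python
import shlex


def build_finetune_command(checkpoint, gpus=(0, 1)):
    """Return (env, argv) for continuing training from a checkpoint with --ft."""
    argv = [
        "python", "train.py",
        "--dataset", "pcontext",
        "--model", "encnet", "--jpu", "--aux", "--se-loss",
        "--backbone", "resnet50",
        "--checkname", "encnet_res50_pcontext",
        "--resume", checkpoint,
        "--ft",  # flag suggested by the maintainer
    ]
    # Expose more than one GPU, as the maintainer recommends (indices are assumed).
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(g) for g in gpus)}
    return env, argv


env, argv = build_finetune_command("encnet_jpu_res50_pcontext.pth.tar")
command_line = " ".join(shlex.quote(a) for a in argv)
print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], command_line)
```

The printed line can be pasted into a shell from the directory that contains train.py, after adjusting the GPU indices to the machine at hand.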

pp00704831 commented 4 years ago

Thank you very much! I will try that.

pp00704831 commented 4 years ago

Sorry to bother you again. I want to run some experiments on the loss function, so I added another loss function in customize.py. However, when I try to import code from another file, the import does not work.

[Screenshot: Selection_003]

My question is whether we need to make some changes to setup.py, because I think the error might come from there. Sorry to bother you so many times; I really appreciate your assistance.

pp00704831 commented 4 years ago

I have gotten it to work, but I still find it strange that we cannot import code from other files. I fixed it by writing everything in the same file instead of importing from another file. Thank you very much!

wuhuikai commented 4 years ago

There are two ways:

1. If you installed with python setup.py install, you need to uninstall and reinstall the package after each modification.
2. Install with python setup.py develop instead.
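The reason behind both options is that python setup.py install places a copy of the package in site-packages, so later edits to the working tree are not what Python imports, while python setup.py develop links the installation back to the source tree. A minimal sketch for checking which copy is actually being imported is shown below. The helper name module_origin is hypothetical; json is used only as a stand-in that is always available, and the name of the package installed by this repository's setup.py would need to be substituted.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Stand-in example with a standard-library package.
print(module_origin("json"))

# A module that is not installed yields None instead of raising.
print(module_origin("surely_not_installed_pkg_xyz"))
```

If the reported path points into site-packages instead of the cloned repository, edits made in the repository will not take effect until the package is reinstalled or installed in develop mode.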

pp00704831 commented 4 years ago

I got it! Thank you.