NOEL-MNI / deepFCD

Automated Detection of Focal Cortical Dysplasia using Deep Learning
https://noel.bic.mni.mcgill.ca/projects/
BSD 3-Clause "New" or "Revised" License

Issue with sys.argv #1

Open creativedoctor opened 2 years ago

creativedoctor commented 2 years ago

Hello there. First of all, thanks for putting this out there. I am attempting to set this up at my workplace (without Docker), but I have been having issues with the sys.argv calls. It starts right at the beginning of inference.py with the GPU call (easily circumvented by replacing it with 'cpu'), and continues at line 45 of inference.py (and, I suspect, the lines after it as well). Since I am running this in JupyterLab, I believe the problem is that sys.argv[3] and above are never populated, while your script reads [3], [4] and [5] at some point. Would you have any ideas on how to deal with this? Thanks.

ravnoor commented 2 years ago

Hey João!

Thank you for taking the time to test this!

I've added a demo Jupyter notebook (app/inference.demo.ipynb) detailing the end-to-end analysis for FCD detection. This should help you get the detection up and running without fiddling with the sys.argv calls in inference.py. Make sure to pull the latest version of the main branch, though.

In the latest version, I've also clustered all the sys.argv calls towards the beginning of the script, so porting any future iterations to a notebook should be trivial. Thanks for pointing it out!
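For anyone porting the script by hand in the meantime, here is a minimal sketch of the workaround. The placeholder values are illustrative only; the actual arguments expected by inference.py are defined in the sys.argv block at the top of the script.

```python
import sys

# In a notebook, sys.argv holds the kernel's own launch arguments rather than
# yours, so indexing into it the way the script does either fails with an
# IndexError or picks up the wrong values. One workaround is to assign the
# list yourself before executing the script's code. The values below are
# placeholders -- replace them with whatever inference.py actually expects.
sys.argv = [
    "inference.py",   # argv[0]: script name
    "placeholder_1",  # argv[1]
    "placeholder_2",  # argv[2]
    "placeholder_3",  # argv[3]
    "placeholder_4",  # argv[4]
    "placeholder_5",  # argv[5]
]
```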

Also, I wouldn't recommend running the analysis on the CPU (not sure if that's what you're doing). Even a cheap GPU would easily net you a 10-20x speedup. For reference, the current notebook (for a single patient) takes 50 minutes to execute on a TITAN RTX.

Let me know how it goes. I'll leave the issue open for your feedback and questions.

Best, Ravnoor

creativedoctor commented 2 years ago

Hi Ravnoor,

Thanks for your update. I have been messing around with this since yesterday, and in the end I figured out the directory structure and got it running... only for it to crash after the process was killed for running out of memory in a Linux VM on my 6-core MacBook Pro (I assumed that, since it runs FreeSurfer, it could probably handle this). I am currently looking into cloud computing with a GPU for testing.

Also, I see you have Training [TODO] in your README. Does that mean there will be a training component for others to use as well (e.g. for site- or scanner-specific models)?

ravnoor commented 2 years ago

The documentation was clearly lacking, but I'm glad you got it to work! I have updated the README to indicate the expected directory structure. It's not BIDS-compliant, but that's probably a future upgrade.

I haven't profiled the RAM usage, but it could be the reason for the crash. You could try reducing options['batch_size'] (currently 350000) and options['mini_batch_size'] (currently 2048), monitor RAM usage, and see if that helps; see the sketch below.
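A rough sketch of what I mean. The defaults are the values quoted above; the smaller numbers are arbitrary starting points, and psutil is only used for the RAM check.

```python
import psutil  # third-party: pip install psutil (only used for the RAM check)

# `options` is the configuration dict defined near the top of inference.py;
# an empty dict stands in for it here so the snippet runs on its own.
options = {}

# Smaller batches trade speed for a lower peak memory footprint.
options['batch_size'] = 100000      # default: 350000
options['mini_batch_size'] = 512    # default: 2048

# Check how much headroom is left while the script runs.
mem = psutil.virtual_memory()
print(f"RAM: {mem.used / 1e9:.1f} GB used of {mem.total / 1e9:.1f} GB ({mem.percent}%)")
```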

Alternatively, you could use Google Colab to test the notebook with a GPU. It has a free tier with time restrictions and a reasonably priced Pro version.

Yes, we're planning to release a train.py for anyone to use on their own data. The patch-based data from the 9 sites in the Neurology article is already open-source. Essentially, anyone could use their own data (or a fraction of it) and/or our patch-based data to train a new model tuned to their specific site/scanner.

creativedoctor commented 2 years ago

I lowered batch_size as low as 5 (five) and mini_batch_size as low as 2 (two), and the script still gets killed on my MacBook. I understand it should take a long time, but with it tuned down that low, wouldn't you expect it to keep running?

NOTE: I'm still running the version from before your update the other day. Could that be causing an issue you have since corrected?

ravnoor commented 2 years ago

Unless it exits with an error, lowering the batch parameters should be fine.

The older versions should work just fine.

How much RAM and how many logical cores are allocated to the VM? Which Linux distribution (and version) are you using? I can try simulating your environment to replicate the issue.
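If it's easier, here is a small standard-library snippet to report those from inside the VM (roughly what `lscpu` and `free -h` show on the shell):

```python
import os
import platform

print("OS:", platform.platform())        # kernel/platform string
print("Logical cores:", os.cpu_count())  # cores visible to the VM
# Total physical RAM via POSIX sysconf (Linux/macOS)
total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
print("Total RAM: %.1f GB" % (total_bytes / 1e9))
```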

creativedoctor commented 2 years ago

I am running CentOS 8.4 in a VM with 4 processors and 10 GB RAM allocated, on a 2019 MacBook Pro (6-core, 16 GB RAM).

ravnoor commented 2 years ago

I haven't tested this on CentOS. I won't be able to spin up a working virtual CentOS installation to help diagnose your issue before the end of next week.

The easiest solution would be to use the docker version. I can show you how to access the bash terminal without actually running inference.

The next best thing would be to use an Ubuntu 18.04/20.04 LTS VM or bare-metal system; these are the two versions tested to work with deepFCD. I also have access to these systems to help troubleshoot any issues.