Closed joaomamede closed 3 years ago
I managed to analyze with 1022x1022 chunks following your guide at: https://github.com/hammerlab/flowdec/blob/master/python/examples/notebooks/Tile-by-tile%20deconvolution%20using%20dask.ipynb
I tried to pass tf.ConfigProto options to try to use my non-GPU RAM, but it did not work. If you think chunked processing is the more appropriate path, I guess I'll do that. Does anyone know of another solution? I'd like the code to be the same on all my computers to avoid problems with analysis.
Hi @joaomamede,
A few thoughts:
I would be surprised if a float32 11x2044x2048 image can't be deconvolved with 4G and pad_mode='none', but doing it without any padding is probably not a good idea if there is a lot of information at the boundaries. You're probably better off going the chunking route so you have more control over it, unfortunately.
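For scale, here is a back-of-the-envelope sketch of what the padding choice does to that 11x2044x2048 stack. The 'log2' default and the 'none' option are flowdec's pad_mode values; everything else (buffer counts inside Richardson-Lucy, TF overhead) is deliberately not modeled:

```python
# Back-of-the-envelope: what flowdec's pad_mode does to the array shape.
# Only the raw float32 copy size is computed; internal FFT buffers are not.
import math

shape = (11, 2044, 2048)
log2_shape = tuple(2 ** math.ceil(math.log2(n)) for n in shape)
mib = lambda s: s[0] * s[1] * s[2] * 4 / 2**20  # float32 bytes -> MiB

print("pad_mode='none':", shape, f"{mib(shape):.0f} MiB per copy")
print("pad_mode='log2':", log2_shape, f"{mib(log2_shape):.0f} MiB per copy")
```

So a single padded copy is only ~256 MiB; it's the several working buffers the iterations keep alive at once that push a 4G card to its limit.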
I was forced to use RHEL by my institution and it's a library nightmare. FlowDec works on this computer only with TensorFlow 2.0 (1.14 and 1.15 can't find the correct CUDA libs).
I pulled the SSD out of my other laptop (Ubuntu) and will run it, with the libraries that I know work, in this 4GB machine to see if that might be the source of the problem.
Just to be sure, I should use TF 1.14 with Cuda 10.1 correct?
Thanks for your help.
Yep, I think 10.1 is the right CUDA version for TF 1.14 (and I would use that). If it's an option on your laptop, I'd also suggest installing Docker and doing a docker pull tensorflow/tensorflow:1.14.0-gpu-py3
to get a container that already has everything in order for you (as far as CUDA toolkit installation goes). That's also a much saner way to manage multiple CUDA versions if you need them, e.g. if you want to try different TF versions.
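Concretely, the container route looks something like the following. This assumes Docker 19.03+ with the NVIDIA container toolkit installed; on older setups the `--gpus all` flag is replaced by `--runtime=nvidia` from nvidia-docker2:

```shell
# pull the TF 1.14 GPU image (CUDA toolkit already baked in)
docker pull tensorflow/tensorflow:1.14.0-gpu-py3

# quick sanity check that the container can see the GPU
docker run --gpus all -it tensorflow/tensorflow:1.14.0-gpu-py3 \
    python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
```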
conda's tensorflow-gpu==1.14 worked with my already-installed CUDA 10.1 libraries (the problem was the package from pip), without any padding options (only flowdec's defaults).
It runs if I use 512x512 chunks, but if I do 1022x1022 (with (0, 6, 6) overlaps) with:
res1 = arr.map_overlap(
    deconv,
    depth=(0, 6, 6),
    boundary='reflect',
    dtype='float32').compute(num_workers=1)
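For reference, the tile/overlap/trim/stitch cycle that map_overlap performs can be mimicked without dask. A numpy-only sketch, where `fake_deconv` is a hypothetical stand-in for the real flowdec call (not flowdec's API):

```python
import numpy as np

def fake_deconv(tile):
    # stand-in for the real flowdec call on one overlapped tile
    return tile.astype('float32')

def tiled_apply(img, tile=(11, 512, 512), depth=(0, 6, 6)):
    """Process img tile by tile, reading `depth` extra voxels of context
    around each tile and trimming them off again before stitching."""
    out = np.empty(img.shape, dtype='float32')
    _, dy, dx = depth  # no overlap along z, matching the snippet above
    for y0 in range(0, img.shape[1], tile[1]):
        for x0 in range(0, img.shape[2], tile[2]):
            y1 = min(y0 + tile[1], img.shape[1])
            x1 = min(x0 + tile[2], img.shape[2])
            # expand the read window by the overlap, clipped to the image
            ry0, rx0 = max(y0 - dy, 0), max(x0 - dx, 0)
            ry1, rx1 = min(y1 + dy, img.shape[1]), min(x1 + dx, img.shape[2])
            res = fake_deconv(img[:, ry0:ry1, rx0:rx1])
            # trim the overlap back off before writing the tile into place
            out[:, y0:y1, x0:x1] = res[:, y0 - ry0:y0 - ry0 + (y1 - y0),
                                       x0 - rx0:x0 - rx0 + (x1 - x0)]
    return out

img = np.random.rand(11, 128, 128).astype('float32')
assert np.allclose(tiled_apply(img, tile=(11, 64, 64)), img)
```

With the identity stand-in the output round-trips exactly; with a real deconvolver the overlap is what suppresses seams at tile boundaries, at the cost of deconvolving each overlap region twice.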
Sometimes it fails, sometimes it doesn't (I guess this RHEL 8 old GNOME version really is GPU-memory intensive). Is there any way to use shared memory with RAM? Would that be slower than 512x512 chunks? YacuDecu, for example, had Device, Stream, and Host modes. Is there any way we can make TensorFlow handle this differently by passing an argument?
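One way to see why 1022x1022 tiles are borderline on 4GB while 512x512 tiles fit comfortably: a rough per-tile estimate. The power-of-two padding matches flowdec's 'log2' default; the ~6 live float32 buffers during Richardson-Lucy iteration is an assumption, not a measured number:

```python
import math

def padded(n):
    # next power of two, as with flowdec's pad_mode='log2'
    return 2 ** math.ceil(math.log2(n))

def tile_mib(z, y, x, n_buffers=6):
    # assumed number of padded float32 working buffers alive at once
    zp, yp, xp = padded(z), padded(y), padded(x)
    return zp * yp * xp * 4 * n_buffers / 2**20

# tile + 2*6 voxels of overlap on each in-plane axis
print(f"512x512 tile:   {tile_mib(11, 512 + 12, 512 + 12):.0f} MiB")
print(f"1022x1022 tile: {tile_mib(11, 1022 + 12, 1022 + 12):.0f} MiB")
```

Under these assumptions a 1022+overlap tile pads up to 2048x2048 in-plane and needs roughly four times the memory of a 512 tile; add the desktop session and TF's own overhead on a 4GB card and intermittent failures are what you'd expect.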
Thanks for any help. In the end I will run it on my lab's Quadro RTX with 8GB of RAM, but I'd like the flexibility to test things on my laptop with the same code.
Here's the whole output from jupyter notebook https://pastebin.com/XQUAkha3
and from the terminal running jupyter https://pastebin.com/41uKpmN8
Note: I hope to make a GUI for FlowDec if I have a bit of time.
Hi, this same code works on my 6GB and 8GB NVIDIA RTX cards (personal laptop and microscope computer).
My work laptop has an NVIDIA Quadro T2000 with only 4GB of RAM:
I run into out-of-memory errors on my T2000 (my arrays are 11Z x 2044Y x 2048X).
I have tried without any padding arguments, and with [1,1,1] as well.
Then I loop through my ND2 files' channels and call the algorithm with:
This is the error that jupyter notebook spits in my terminal:
Anything I can do?
Thank you for flowdec!