tbepler / topaz

Pipeline for particle picking in cryo-electron microscopy images using convolutional neural networks trained from positive and unlabeled examples. Also featuring micrograph and tomogram denoising with DNNs.
GNU General Public License v3.0

Add ETA indicator to topaz preprocess? #37

Closed olibclarke closed 5 years ago

olibclarke commented 5 years ago

Hi,

When running topaz_preprocess to downsample and normalize, it would be nice if it gave an indication of how long it was going to take. I am running it on a large (3300 mic) dataset right now, and with 24 workers it has already taken ~30min, with no indication (apart from running processes) that it is doing anything - no mics written and nothing written to stdout. Is it normal that it does not output the processed micrographs until the end?

Cheers, Oli

olibclarke commented 5 years ago

Hmm - it completed, but all the micrographs it has written are full of NaNs. The command I used was:

topaz preprocess -s 8 -o processed/micrographs/ J89/motioncorrected/*doseweighted.mrc

I have run this on other micrographs processed in the same way without issues - am I doing something obviously dense?

[attached image]

tbepler commented 5 years ago

You can turn on progress notification with the -v flag.

That's a weird error. I haven't seen NaNs pop up in micrographs after preprocessing before. Are you still on v0.2.1 of topaz?

olibclarke commented 5 years ago

Yep, on 0.2.1. Interestingly, if I process a single micrograph it looks normal (see attached) - it is only when processing all at once that they are all filled with NaNs.

[attached image]

tbepler commented 5 years ago

Interesting, you may be in some sort of bad corner case for the parameter estimation algorithm. Does it require the whole dataset to cause the error or is there some sort of minimal subset you could share with me for debugging?

That said, I changed the algorithm in the current version of topaz on GitHub, which will likely fix the problem. It hasn't been released as a stable version on conda, etc., yet, but you can install it by pulling the most recent GitHub release. It allows micrographs to be processed in a stream, and preprocessing can be GPU accelerated as well.

olibclarke commented 5 years ago

Not sure - I'll take a look and see if I can find a smaller subset that shows the issue. So far I've only tried the whole dataset or one micrograph.
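
When hunting for a smaller failing subset, it can help to count NaNs in the written micrographs programmatically rather than opening each one. The following is only a rough sketch using the Python standard library. It assumes the outputs are little-endian MRC files in mode 2 (32-bit float), which may not hold for every setup, and `count_nans_mrc` is a hypothetical helper name, not part of topaz.

```python
import array
import math
import struct
import sys


def count_nans_mrc(path):
    """Count NaN pixels in a little-endian, mode-2 (float32) MRC file.

    Assumption: standard MRC layout with a 1024-byte main header,
    nx/ny/nz/mode in the first four 32-bit words, and the extended
    header length stored at byte offset 92.
    """
    with open(path, "rb") as fh:
        header = fh.read(1024)
        nx, ny, nz, mode = struct.unpack("<4i", header[:16])
        if mode != 2:
            raise ValueError(f"expected float32 data (mode 2), got mode {mode}")
        (ext_size,) = struct.unpack("<i", header[92:96])
        fh.seek(1024 + ext_size)  # skip the extended header, if any
        pixels = array.array("f")
        pixels.frombytes(fh.read(4 * nx * ny * nz))
    if sys.byteorder == "big":
        pixels.byteswap()  # file is assumed little-endian
    return sum(1 for value in pixels if math.isnan(value))
```

Running this over the output directory and listing the files with a nonzero count would show whether every micrograph is affected or only some of them. A pure-Python loop like this is slow on large images, so an array library would be the better choice for routine use.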

tbepler commented 5 years ago

It would probably be faster to just switch to the most recent topaz version here. Also, if the per-micrograph run looks fine, then I would just go with those.

olibclarke commented 5 years ago

You mean just running topaz preprocess in a for loop across all the mics? I can do that. I'll try the dev branch, too.
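
For reference, the per-micrograph loop could look something like the sketch below. It reuses only the flags that appear in this thread (-s, -o, and -v) and the paths from the command above. The function names and the `dry_run` switch are made up for illustration. With `dry_run=True` it only prints the commands, so nothing is executed until that is turned off.

```python
import glob
import subprocess


def build_command(micrograph, out_dir="processed/micrographs/", scale=8):
    """Return the argument list for preprocessing a single micrograph."""
    # -s, -o and -v are the flags mentioned in this thread; nothing else is assumed.
    return ["topaz", "preprocess", "-v", "-s", str(scale), "-o", out_dir, micrograph]


def preprocess_each(pattern, dry_run=True):
    """Invoke topaz preprocess once per micrograph matching `pattern`."""
    commands = [build_command(path) for path in sorted(glob.glob(pattern))]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))  # show what would be run
        else:
            subprocess.run(cmd, check=True)  # stop at the first failure
    return commands


if __name__ == "__main__":
    preprocess_each("J89/motioncorrected/*doseweighted.mrc", dry_run=True)
```

Each micrograph is handled by a separate invocation here, matching the single-micrograph runs that looked normal, at the cost of starting one process per file.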

tbepler commented 5 years ago

Yeah, that's what I meant.

Since this seems to be a bug in the old algorithm, I'm going to close this issue. If it does come up in the dev version or you have any other questions about it, please open a new one.