Closed olibclarke closed 5 years ago
Hmm - it completed, but all the micrographs it has written are full of NaNs. The command I used was:
topaz preprocess -s 8 -o processed/micrographs/ J89/motioncorrected/*doseweighted.mrc
I have run this on other micrographs processed in the same way without issues - am I doing something obviously dense?
You can turn on progress notification with the -v flag.
That's a weird error. I haven't seen NaNs pop up in micrographs after preprocessing before. Are you still on v0.2.1 of topaz?
Yep, on 0.2.1. Interestingly, if I process a single micrograph it looks normal (see attached) - it is only when processing all at once that they are all filled with NaNs.
Interesting, you may be in some sort of bad corner case for the parameter estimation algorithm. Does it require the whole dataset to cause the error or is there some sort of minimal subset you could share with me for debugging?
That said, I changed the algorithm in the current version of topaz on github which will likely fix the problem. It hasn't been released as a stable version on conda, etc., yet, but you can install it by pulling the most recent github release. It allows micrographs to be processed in a stream and for preprocessing to be GPU accelerated as well.
Not sure - I'll take a look and see if I can find a smaller subset that shows the issue. So far I've only tried the whole dataset or one micrograph.
It would probably be faster to just switch to the most recent topaz version here. Also, if the per micrograph run looks fine, then I would just go with those.
You mean just running topaz preprocess in a for loop across all the mics? I can do that. I'll try the dev branch, too.
Yeah, that's what I meant.
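For anyone landing here later, a rough sketch of that per-micrograph loop is below. It is only an illustration, not something taken from the topaz docs: the function name `preprocess_each` and the `TOPAZ_CMD` variable are made up for this example, and the only topaz flags used are the `-s` and `-o` flags from the command earlier in the thread. It also prints a simple counter to stderr, so a long run shows some sign of life.

```shell
#!/usr/bin/env bash
# Sketch: preprocess micrographs one at a time instead of in a single batch.
# TOPAZ_CMD and preprocess_each are hypothetical names for this example;
# the -s and -o flags are the ones used in the original command.

TOPAZ_CMD="${TOPAZ_CMD:-topaz}"

preprocess_each() {
    local scale="$1" out_dir="$2"
    shift 2
    local total="$#" done_count=0 failed=0 mic
    mkdir -p "$out_dir"
    for mic in "$@"; do
        done_count=$((done_count + 1))
        # Report progress on stderr so long runs are visibly alive.
        echo "[$done_count/$total] $mic" >&2
        # One topaz invocation per micrograph; keep going if one fails.
        if ! "$TOPAZ_CMD" preprocess -s "$scale" -o "$out_dir" "$mic"; then
            echo "failed: $mic" >&2
            failed=$((failed + 1))
        fi
    done
    # Return non-zero if any micrograph failed.
    [ "$failed" -eq 0 ]
}
```

With the paths from the original command, the call would look like `preprocess_each 8 processed/micrographs/ J89/motioncorrected/*doseweighted.mrc`. This runs serially, so it will be slower than a single multi-worker run.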
Since this seems to be a bug in the old algorithm, I'm going to close this issue. If it does come up in the dev version or you have any other questions about it, please open a new one.
Hi,
When running topaz_preprocess to downsample and normalize, it would be nice if it gave an indication of how long it was going to take. I am running it on a large (3300 mic) dataset right now, and with 24 workers it has already taken ~30min, with no indication (apart from running processes) that it is doing anything - no mics written and nothing written to stdout. Is it normal that it does not output the processed micrographs until the end?
Cheers Oli