preprocessing: file name parsing fails with "_t"

fjug / MoMA

MoMA - the MotherMachine Analyzer


preprocessing: file name parsing fails with "_t" #29

Closed guiwitz closed 7 years ago

guiwitz commented 8 years ago

I followed all the instructions to get MoMA and the preprocessing working, but I'm stuck with a bug that I'm not able to solve. When I run the preprocessing using the moma_preprocess script on the official dataset in the folder called "MoMA_preproc_example", I get the following error message:

Caused by: java.lang.IllegalArgumentException: ERROR File list corrupt. Time could not be extracted for file /Users/guillaume/Desktop/PostdocBasel/MoMA_test/MoMA_preproc_example/20150624-lac-2-MMStack-Pos0-preproc_t0001_c0001.tif.
    at com.jug.mmpreprocess.MMDataSource.isInDataRange(MMDataSource.java:82)
    at com.jug.mmpreprocess.MMDataSource.<init>(MMDataSource.java:47)
    at com.jug.mmpreprocess.MMPreprocess.main(MMPreprocess.java:52)
    ... 5 more

I reproduced the bug with other datasets with other names, but I can't find a solution. The file name parser seems to have a problem...

fjug commented 8 years ago

Hi guiwitz,

could you let me know how you are trying to call mmpreprocess? Potentially it might also help to see the content of the folder you are trying to preprocess...

Thanks, Florian

julou commented 8 years ago

Let me introduce Guillaume (@guiwitz), who is a senior postdoc in our lab and very kindly volunteered to be the first guinea pig installing MOMA rather than using the already deployed version we have here. Thanks Guillaume!

In fact I spent some time with him on this today: we checked that I could not reproduce the problem on my laptop (same Java version, same md5 of mmpreprocess.jar, etc.).

I finally got my eureka moment, remembering that the file name parsing was a bit touchy… in fact it's looking for the first occurrence of "_t" in the full path(!) Indeed, Guillaume was using a path with ~/MOMA_test/... in it. Removing the underscore fixed the bug.
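To make the failure mode concrete, here is a minimal sketch that contrasts taking the first "_t" with taking the last one on the path from the error message. It is an illustration only: the class and method names are hypothetical and this is not the actual MMDataSource code, which may parse the name differently.

```java
// Hypothetical illustration; NOT the actual MMDataSource parsing code.
final class FirstVsLastOccurrence {

    /** Parses the run of digits starting at index 'from'; returns -1 if there is none. */
    static int digitsAfter(String s, int from) {
        int end = from;
        while (end < s.length() && Character.isDigit(s.charAt(end))) {
            end++;
        }
        return end > from ? Integer.parseInt(s.substring(from, end)) : -1;
    }

    /** Naive variant: uses the first "_t" found anywhere in the full path. */
    static int timeFromFirstMatch(String path) {
        int idx = path.indexOf("_t");
        return idx < 0 ? -1 : digitsAfter(path, idx + 2);
    }

    /** Variant that uses the last "_t" found in the path. */
    static int timeFromLastMatch(String path) {
        int idx = path.lastIndexOf("_t");
        return idx < 0 ? -1 : digitsAfter(path, idx + 2);
    }

    public static void main(String[] args) {
        String path = "/Users/guillaume/Desktop/PostdocBasel/MoMA_test/MoMA_preproc_example/"
                + "20150624-lac-2-MMStack-Pos0-preproc_t0001_c0001.tif";
        // The first "_t" sits in the folder name "MoMA_test", so no digits follow it.
        System.out.println(timeFromFirstMatch(path)); // prints -1
        // The last "_t" is the one in "preproc_t0001".
        System.out.println(timeFromLastMatch(path));  // prints 1
    }
}
```

With the first match, the parser lands on "MoMA_test" and finds "est" instead of a number, which is consistent with the "Time could not be extracted" error above.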

In fact mmpreprocess should look for the last occurrence of "_t". Or what do you think? By the way, if you touch this parsing code, it would make sense to parse "_c" as well and count the number of channels (so that it doesn't have to be set manually).

fjug commented 8 years ago

Thanks for the details and for labeling this as a bug. I will eventually change the code so that it looks for the last occurrence of '_t' that is followed by a number of digits. I guess that might turn out to be the most robust; what do you think?
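One way the proposed rule ("last occurrence of '_t' that is followed by digits") could look is sketched below with a regular expression. The class name TimePointParser and the method parseTimePoint are made up for this example and are not part of MoMA; the sketch also assumes the digit run fits into an int.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative and not taken from the MoMA code base.
final class TimePointParser {

    // "_t" immediately followed by one or more digits, e.g. "_t0001".
    private static final Pattern TIME_TOKEN = Pattern.compile("_t(\\d+)");

    /** Returns the number after the last "_t<digits>" token in the path, if there is one. */
    static OptionalInt parseTimePoint(String path) {
        Matcher m = TIME_TOKEN.matcher(path);
        String digits = null;
        while (m.find()) {
            digits = m.group(1); // keep overwriting so that the last match wins
        }
        return digits == null
                ? OptionalInt.empty()
                : OptionalInt.of(Integer.parseInt(digits));
    }

    public static void main(String[] args) {
        // "_test" is skipped because no digit follows "_t"; "_t0012" is the last valid token.
        System.out.println(parseTimePoint("/home/a_test/run_t7/x_t0012_c0001.tif").getAsInt()); // prints 12
        System.out.println(parseTimePoint("/home/a_test/no_time.tif").isPresent());             // prints false
    }
}
```

Requiring digits after "_t" already skips folder names such as "MoMA_test"; taking the last match additionally guards against folders that happen to contain a token like "_t7".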

julou commented 8 years ago

as long as we don't use stacks ;) it's probably the most sensible fix.

just as parsing the number of channels in addition to the time series would be very sensible!…
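The channel-counting idea could be sketched along the same lines: collect the distinct numbers that follow "_c" across all file names and take the size of that set. This is only a sketch under the assumption that every file carries a "_c<digits>" token, as in the example dataset; ChannelCounter and countChannels are hypothetical names, not existing MoMA code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative and not taken from the MoMA code base.
final class ChannelCounter {

    // "_c" immediately followed by one or more digits, e.g. "_c0001".
    private static final Pattern CHANNEL_TOKEN = Pattern.compile("_c(\\d+)");

    /** Counts the distinct channel indices found in the given file names. */
    static int countChannels(List<String> fileNames) {
        Set<Integer> channels = new TreeSet<>();
        for (String name : fileNames) {
            Matcher m = CHANNEL_TOKEN.matcher(name);
            String digits = null;
            while (m.find()) {
                digits = m.group(1); // the last "_c<digits>" token in the name wins
            }
            if (digits != null) {
                channels.add(Integer.parseInt(digits));
            }
        }
        return channels.size();
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList(
                "pos0_t0001_c0001.tif", "pos0_t0001_c0002.tif",
                "pos0_t0002_c0001.tif", "pos0_t0002_c0002.tif");
        System.out.println(countChannels(files)); // prints 2
    }
}
```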

fjug commented 8 years ago

You have this way of saying things... it always makes me want to let everything else be and work for you right away... ;)