Error in scripts_evalresults/postprocess_predictions.py when using _0 or _1 as part of a filename.

id-b3 commented 3 years ago

The script crashes when the input NIfTI image has a filename such as 000000_0, 000000_1 or 115236_0. Here is the error output:

Input: 'probmaps_proc-01.nii.gz'...  
Input dims : '(406, 309, 424)'...  
Assigned to Reference file: '114647_0.nii.gz'...  
Input data to Network were cropped -> Extend prediction to full-size image...  
Input data were cropped to bounding-box: '((29, 435), (93, 402), (49, 473))'...  
Extend input prediction with bounding-box: '((29, 435), (93, 402), (49, 473))'...  
Input data to Network were masked to ROI (lungs) -> Reverse mask in predictions...  
/bronchinet/src/common/functionutil.py:205: FutureWarning: Possible nested set at position 1  
  sre_substring_filename = re.search(pattern_search, filename)  
In FILE '/bronchinet/src/common/functionutil.py' and LINE '233':  
ERROR: Cannot find a substring with pattern ('[[0-9]+-9]+_0') in the file '114647_0.nii.gz'... EXIT

Changing the suffix to a word avoids the issue (e.g. 000000_zero or 000000_one), but this is not ideal when the filenames are generated by indexing.

antonioguj commented 3 years ago

Dear user,

I'm sorry for this issue. It is caused by the function 'get_regex_file_pattern', which automatically derives the 'regex (regular expression)' file pattern of your input files; this pattern is needed later in the code. For some reason, it assigns the pattern '[[0-9]+-9]+_0', which is clearly wrong.
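As an illustration only, here is one way a malformed pattern like this can arise. The function below is a hypothetical reconstruction, not the actual code of 'get_regex_file_pattern' (which is not shown in this thread). It assumes each run of digits is replaced in turn by '[0-9]+' using a plain string replace. Under that assumption, the single digit '0' is first found inside the '[0-9]+' that was just inserted, which yields the pattern from the error message.

```python
import re


def naive_regex_pattern(basename):
    """Hypothetical, deliberately buggy sketch (NOT the bronchinet implementation).

    Replaces every run of digits in 'basename' by '[0-9]+', one run at a time,
    searching the partially rewritten string for each run.
    """
    pattern = basename
    for digits in re.findall("[0-9]+", basename):
        # Bug: the search also sees the '0' and '9' inside an already
        # inserted '[0-9]+', so a one-digit run can be replaced there.
        pattern = pattern.replace(digits, "[0-9]+", 1)
    return pattern


print(naive_regex_pattern("114647_0"))   # one-digit suffix
print(naive_regex_pattern("114647_00"))  # two-digit suffix
```

With this sketch, '114647_0' gives '[[0-9]+-9]+_0', while '114647_00' gives '[0-9]+_[0-9]+', because the substring '00' does not occur inside '[0-9]+'.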

I will look into this. In the meantime, you can work around this issue by assigning the correct 'regex' pattern directly to the variable 'pattern_search', further up in the same script. In your case that is: pattern_search = '[0-9]+_0' (in string format).

If your input filenames can contain a 1-digit suffix other than '0', you need to assign: pattern_search = '[0-9]+_[0-9]'.
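A quick way to check a candidate pattern before editing the scripts is to run it against a few filenames. This is a minimal sketch: the helper 'find_prefix' is a made-up name for illustration, and the second pattern is written with the underscore separator that the filenames in this issue use.

```python
import re

# Filenames taken from this issue.
filenames = ["114647_0.nii.gz", "000000_1.nii.gz", "115236_0.nii.gz"]

pattern_zero_only = "[0-9]+_0"      # suffix is always '0'
pattern_any_digit = "[0-9]+_[0-9]"  # suffix is any single digit


def find_prefix(pattern, filename):
    """Return the substring matched by 'pattern', or None if nothing matches."""
    match = re.search(pattern, filename)
    return match.group(0) if match else None


for name in filenames:
    print(name, find_prefix(pattern_zero_only, name), find_prefix(pattern_any_digit, name))
```

The first pattern finds nothing in '000000_1.nii.gz', which is why the more general pattern is needed as soon as suffixes other than '0' appear.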

You will also need to do the same modification in other scripts within the pipeline to compute the airway predictions, namely in 'process_airway_tree.py' and 'compute_result_metrics.py'.

Please let me know if this solves your issues. Thanks

antonioguj commented 3 years ago

Dear user,

I've looked at this issue. It's due to a sneaky bug in the way I implemented the function "get_regex_file_pattern", and I need to figure out an alternative implementation. In the meantime, you can easily avoid this issue by using at least 2 digits in each component of your filenames, i.e. '000000_01' and '115236_00' instead of '000000_1' and '115236_0'. With this solution, you can ignore the 'workaround' that I proposed above.
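If the filenames already exist, the suffix can be zero-padded with a small helper. The sketch below assumes names of the form '<digits>_<digits>' followed by an optional extension; 'pad_index' and its 'width' parameter are hypothetical names, not part of bronchinet.

```python
import re

# Assumed layout: <case digits>_<index digits><optional extension>
_NAME = re.compile(r"^(?P<case>[0-9]+)_(?P<index>[0-9]+)(?P<ext>\..*)?$")


def pad_index(filename, width=2):
    """Zero-pad the numeric suffix to at least 'width' digits.

    Names that do not follow the assumed layout are returned unchanged.
    """
    match = _NAME.match(filename)
    if match is None:
        return filename
    ext = match.group("ext") or ""
    return f"{match.group('case')}_{int(match.group('index')):0{width}d}{ext}"


for name in ["114647_0.nii.gz", "000000_1", "115236_12.nii.gz"]:
    print(name, "->", pad_index(name))
```

When the names are generated by indexing, the same effect can be had at creation time with a format string such as f"{case_id:06d}_{index:02d}.nii.gz", where 'case_id' and 'index' stand for whatever integer counters are in use.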

Please let me know if this is a solution for you. Thanks

antonioguj / bronchinet

Error in scripts_evalresults/postprocess_predictions.py when using _0 or _1 as part of a filename. #15