aramis-lab / clinica

Software platform for clinical neuroimaging studies
http://www.clinica.run/

PET Pipeline: Non-Selective Reprocessing of Eligible BIDS Entries #1079

Open souravraha opened 5 months ago

souravraha commented 5 months ago

Describe the bug Unlike the T1 pipeline, the PET pipeline does not skip BIDS entries that have already been processed. As a result, every run of the PET pipeline reprocesses all eligible BIDS entries.

To Reproduce Steps to reproduce the behavior:

  1. Execute the PET pipeline on ADNI data, e.g. using the 18FAV45 tracer with the cerebellumPons2 SUVR.
  2. Run it again on the same BIDS and CAPS folders.
  3. Notice that it reprocesses all eligible BIDS entries, regardless of whether they have been processed before.

Expected behavior The PET pipeline should skip already-processed BIDS entries, as the T1 pipeline does.

Screenshots N/A


Additional context This issue affects the efficiency and performance of the PET pipeline. Skipping already-processed BIDS entries would significantly shorten re-runs and eliminate redundant processing.

NicolasGensollen commented 5 months ago

Hi @souravraha

Thanks for pointing this out. I'm not sure I fully understand what you mean, though. I haven't looked deeply into it, but as long as you provide a working directory to these pipelines (and use the same working directory the second time you run them), Nipype should be clever enough not to re-run the computations. If you look at the logs, you should see that it is using cached results. I don't believe Clinica explicitly implements any caching mechanism other than this one (I might have to double-check that...).
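
For reference, here is a minimal standalone sketch of the Nipype behavior I mean (a toy example, not Clinica's pipeline code): when a workflow is given a `base_dir` (the role played by `-wd`), re-running the identical workflow reuses the cached node results instead of recomputing them.

    # Minimal sketch of Nipype's working-directory caching.
    # Assumptions: a toy Function node; this is not Clinica's code.
    import nipype.pipeline.engine as pe
    from nipype.interfaces.utility import Function

    def add_one(x):
        return x + 1

    node = pe.Node(
        Function(input_names=["x"], output_names=["out"], function=add_one),
        name="add_one",
    )
    node.inputs.x = 1

    # base_dir plays the role of the -wd working directory: node hashes
    # and results are stored there and reused on identical re-runs.
    wf = pe.Workflow(name="demo", base_dir="/DATA/user/tmp")
    wf.add_nodes([node])
    wf.run()  # run twice: the second run reports cached results in the logs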

Could you share the commands that you executed?

souravraha commented 5 months ago

@NicolasGensollen Upon re-running the command:

clinica run pet-linear --save_pet_in_t1w_space -wd /DATA/user/tmp/ bids CAPS 18FAV45 cerebellumPons2 -tsv partial_list.tsv

I noticed that the PET pipeline processes each BIDS subject sequentially, even though the pipeline had already been run on the same data.

In contrast, the T1 pipeline issues a warning that each BIDS subject has already been processed and promptly skips it, so re-execution typically completes within a few seconds.

NicolasGensollen commented 5 months ago

@souravraha I think you're right.

There is some logic implemented in the AnatLinearPipeline which looks for already-processed images and skips them:

https://github.com/aramis-lab/clinica/blob/2939b05096a109ddcc2c060c02d55079f55c6ba5/clinica/pipelines/t1_linear/anat_linear_pipeline.py#L136-L153

This is based on the implementation of this method:

https://github.com/aramis-lab/clinica/blob/2939b05096a109ddcc2c060c02d55079f55c6ba5/clinica/pipelines/t1_linear/anat_linear_pipeline.py#L28-L41

This method has an abstract definition in the engine but is not implemented by all pipelines (the PET pipelines, for example, do not implement it).
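
For illustration, a hedged sketch of what such a method could look like for a PET pipeline (the output path and file pattern below are assumptions; the real queries would go through Clinica's CAPS file patterns):

    # Hypothetical sketch, not Clinica's actual implementation: report the
    # visits whose pet-linear outputs already exist in CAPS so the pipeline
    # can skip them when reading inputs.
    from pathlib import Path

    def get_processed_images(caps_directory, subjects, sessions):
        """Return 'sub-XXX_ses-YYY' IDs that already have pet-linear outputs."""
        processed = []
        for subject, session in zip(subjects, sessions):
            # Assumed output location; the real code would use Clinica's
            # CAPS query utilities rather than a hard-coded path.
            output_dir = (
                Path(caps_directory) / "subjects" / subject / session / "pet_linear"
            )
            if output_dir.exists() and any(output_dir.glob("*.nii.gz")):
                processed.append(f"{subject}_{session}")
        return processed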

What's even stranger is that some pipelines (like DWIPreprocessingUsingT1) seem to implement the method but have no skipping logic when reading input files...

I think we should definitely fix this and offer a similar user experience for all pipelines. I'll add this to my todo list 😅

souravraha commented 5 months ago

@NicolasGensollen

While we're discussing this, I'd like to revisit a previous issue, #1060, which you helped resolve. After incorporating your enhancements and executing the converter on the existing BIDS directory, I encountered the same errors mentioned in #1060. Digging deeper, I identified files with troublesome suffixes ("ADC", "real") within the BIDS directory; these stemmed from an earlier, problematic version of the converter. Once I removed these older files, the converter ran successfully with your enhancements.

It appears that such improvements may not take effect until the problematic files are removed from disk. This could be mitigated if the converter detected such files up front. I wanted to bring this to your attention.
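
Something as simple as the following pre-flight scan could surface the leftovers before the converter runs (a hedged sketch, not part of Clinica; the suffix list and naming convention are assumptions based on the files I found):

    # Hypothetical pre-flight check: flag NIfTI files whose names carry the
    # problematic suffixes left behind by the older converter.
    from pathlib import Path

    PROBLEM_SUFFIXES = ("ADC", "real")  # assumed list, from the files I removed

    def find_leftover_files(bids_directory):
        bids = Path(bids_directory)
        return sorted(
            path
            for path in bids.rglob("*.nii.gz")
            for suffix in PROBLEM_SUFFIXES
            if path.name.split(".")[0].endswith(f"_{suffix}")
        )

    for leftover in find_leftover_files("bids"):
        print(f"Warning: stale converter output, consider removing: {leftover}")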

github-actions[bot] commented 1 month ago

This issue is considered stale because it has not received further activity for the last 14 days. You may remove the inactive label or add a comment, otherwise it will be closed after the next 14 days.