CompImg / LST-AI

LST-AI - Deep Learning Ensemble for Accurate MS Lesion Segmentation
https://doi.org/10.1016/j.nicl.2024.103611
MIT License

Inconsistent Results When Running the Same Example Multiple Times #23

Closed dagarcial closed 1 month ago

dagarcial commented 1 month ago

Issue Summary: I’ve noticed that running the same example multiple times produces different results.

Environment Details:

Steps to Reproduce:

Observed Behavior:

Questions:

Any guidance on how to resolve this issue would be greatly appreciated. Thank you for your help!

twiltgen commented 1 month ago

Hi @dagarcial,

Thank you very much for bringing this issue to our attention.

If possible, could you also share the results for the total lesion volume and total lesion number (not annotated)?

Deterministic Nature of LST-AI:

Although we haven't recently verified whether LST-AI is fully deterministic (I can't recall if I did this in the past), there is definitely some variation introduced by the registration process, particularly in the deformable registration for lesion annotation.

Upon reviewing the two runs, I mostly notice differences in the lesion count and slight variations in the total lesion volume. We suspect that the variability in lesion numbers arises because LST-AI tends to segment even very small lesions, which we do not threshold or remove -- something we might consider implementing in the future.

This could explain the differences in output, as the registration process is not entirely deterministic in this context.

In the near future, we will also check if providing a seed for tensorflow and other methods will have some impact on reproducibility of the output.

Does this help?

dagarcial commented 1 month ago

Hi @twiltgen,

Thank you so much for your detailed explanation.

Regarding the registration, is it possible to mitigate the stochastic behavior while still ensuring good results?

About the total lesion volume, I think it is the information in lesion_stats.csv, isn't it? Here is the data for the cases I shared before. UKbiobank case:

Thanks again for your assistance!

twiltgen commented 1 month ago

Hi @dagarcial

Thank you very much for posting the total lesion volume results!

Regarding registration, the implemented tool consistently provided good registration results and is rather fast in terms of processing time. Mitigating the stochastic behavior would inevitably lead to different results in the registration process and, therefore, also to slightly different lesion segmentation results.

But the idea of having a purely deterministic tool is definitely interesting, and we will follow up on that (requires some time and a systematic approach, of course :) ). Thanks again for highlighting this aspect and if you consider trying different approaches yourself, we would be very grateful for any feedback.

dagarcial commented 1 month ago

Hi @twiltgen

Thanks, I think we can close the issue for now.

In case I have some updates, I will reopen the issue to share them with you.

Thanks again.

jqmcginnis commented 1 month ago

Hi @dagarcial @twiltgen - this bothered me a bit, and I quickly tried to assess the effect of fixing the seeds.

For this, I made two code changes:

https://github.com/CompImg/LST-AI/blob/2429c5561a554da94a360f6847d9d74e06e645ea/LST_AI/lst#L20-L21

and

https://github.com/CompImg/LST-AI/blob/2429c5561a554da94a360f6847d9d74e06e645ea/LST_AI/lst_config.py#L6-L10

I briefly checked the effect of seeding for two patients and three runs each in the stats files. I do not see any relevant change in terms of being more consistent.
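To put a rough number on the run-to-run consistency, one option is the coefficient of variation of `Lesion_Volume` per subject and condition. The sketch below uses only the standard library, with the values copied from Table 1 below; the helper name `coefficient_of_variation` is introduced here for illustration and is not part of LST-AI.

```python
from statistics import mean, stdev

# Lesion_Volume values copied from Table 1 (three runs per condition).
volumes = {
    ("Subject 1", "No Seeding"): [3812.5158807635307, 3865.517910718918, 3758.7877956032753],
    ("Subject 1", "Seeding"): [3807.4334943294525, 3756.246602386236, 3831.030288487673],
    ("Subject 2", "No Seeding"): [146.8125, 144.703125, 155.671875],
    ("Subject 2", "Seeding"): [149.765625, 154.828125, 154.40625],
}


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100.0 * stdev(values) / mean(values)


# One CV per (subject, condition) pair.
cv = {key: coefficient_of_variation(vals) for key, vals in volumes.items()}

for (subject, condition), value in cv.items():
    print(f"{subject}, {condition}: CV = {value:.2f}%")
```

All four conditions stay within a few percent. With only three runs per condition, differences between the seeded and unseeded spread are hard to separate from chance.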

@dagarcial I believe I seeded this correctly, or do you see any flaw in the logic? I believe seeding once before loading the other modules should be sufficient, correct?
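For reference, a minimal sketch of what "seeding once, up front" can look like in Python. This is an illustration under assumptions, not the code from the linked commits: the helper name `seed_everything` is hypothetical, and NumPy and TensorFlow are seeded only if they happen to be installed.

```python
import os
import random


def seed_everything(seed: int = 0) -> None:
    """Seed the random number generators available in this environment."""
    # PYTHONHASHSEED set here only affects subprocesses started afterwards,
    # not the hash randomization of the already running interpreter.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)
    except ImportError:
        pass  # TensorFlow not installed; nothing to seed


# Re-seeding with the same value reproduces the same random draw.
seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)
```

Seeds set this way only reach random number generators inside the Python process. An external registration binary launched as a subprocess has its own sources of variation and is not affected.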

Anyhow, here are the results I obtained for the stats files.

Table 1: Lesion Statistics

| File Name | Num_Lesions | Num_Vox | Lesion_Volume |
|---|---|---|---|
| Subject 1, Test 1, No Seeding | 49 | 10502 | 3812.5158807635307 |
| Subject 1, Test 2, No Seeding | 45 | 10648 | 3865.517910718918 |
| Subject 1, Test 3, No Seeding | 49 | 10354 | 3758.7877956032753 |
| Subject 1, Test 1, Seeding | 45 | 10488 | 3807.4334943294525 |
| Subject 1, Test 2, Seeding | 47 | 10347 | 3756.246602386236 |
| Subject 1, Test 3, Seeding | 44 | 10553 | 3831.030288487673 |
| Subject 2, Test 1, No Seeding | 4 | 348 | 146.8125 |
| Subject 2, Test 2, No Seeding | 5 | 343 | 144.703125 |
| Subject 2, Test 3, No Seeding | 4 | 369 | 155.671875 |
| Subject 2, Test 1, Seeding | 5 | 355 | 149.765625 |
| Subject 2, Test 2, Seeding | 4 | 367 | 154.828125 |
| Subject 2, Test 3, Seeding | 5 | 366 | 154.40625 |

Table 2: Annotated Lesion Statistics

| File Name | Region | Num_Lesions | Num_Vox | Lesion_Volume |
|---|---|---|---|---|
| Subject 1, Test 1, No Seeding | Periventricular | 28 | 8656 | 3142.3669266700745 |
| Subject 1, Test 1, No Seeding | Juxtacortical | 12 | 1072 | 389.1655898094177 |
| Subject 1, Test 1, No Seeding | Subcortical | 9 | 774 | 280.98336428403854 |
| Subject 1, Test 1, No Seeding | Infratentorial | 0 | 0 | 0.0 |
| Subject 1, Test 2, No Seeding | Periventricular | 23 | 9111 | 3307.5444857776165 |
| Subject 1, Test 2, No Seeding | Juxtacortical | 12 | 1056 | 383.3571481704712 |
| Subject 1, Test 2, No Seeding | Subcortical | 10 | 481 | 174.61627677083015 |
| Subject 1, Test 2, No Seeding | Infratentorial | 0 | 0 | 0.0 |
| Subject 1, Test 3, No Seeding | Periventricular | 21 | 7305 | 2651.9166357815266 |
| Subject 1, Test 3, No Seeding | Juxtacortical | 12 | 1865 | 677.0464785397053 |
| Subject 1, Test 3, No Seeding | Subcortical | 13 | 1177 | 427.28348806500435 |
| Subject 1, Test 3, No Seeding | Infratentorial | 3 | 7 | 2.5411932170391083 |
| Subject 1, Test 1, Seeding | Periventricular | 23 | 8983 | 3261.0769526660442 |
| Subject 1, Test 1, Seeding | Juxtacortical | 14 | 1076 | 390.61770021915436 |
| Subject 1, Test 1, Seeding | Subcortical | 8 | 429 | 155.73884144425392 |
| Subject 1, Test 1, Seeding | Infratentorial | 0 | 0 | 0.0 |
| Subject 1, Test 2, Seeding | Periventricular | 19 | 7307 | 2652.642690986395 |
| Subject 1, Test 2, Seeding | Juxtacortical | 13 | 1874 | 680.3137269616127 |
| Subject 1, Test 2, Seeding | Subcortical | 13 | 1156 | 419.659908413887 |
| Subject 1, Test 2, Seeding | Infratentorial | 2 | 10 | 3.6302760243415833 |
| Subject 1, Test 3, Seeding | Periventricular | 23 | 9005 | 3269.0635599195957 |
| Subject 1, Test 3, Seeding | Juxtacortical | 12 | 1056 | 383.3571481704712 |
| Subject 1, Test 3, Seeding | Subcortical | 9 | 492 | 178.6095803976059 |
| Subject 1, Test 3, Seeding | Infratentorial | 0 | 0 | 0.0 |
| Subject 2, Test 1, No Seeding | Periventricular | 1 | 148 | 62.4375 |
| Subject 2, Test 1, No Seeding | Juxtacortical | 2 | 61 | 25.734375 |
| Subject 2, Test 1, No Seeding | Subcortical | 1 | 139 | 58.640625 |
| Subject 2, Test 1, No Seeding | Infratentorial | 0 | 0 | 0.0 |
| Subject 2, Test 2, No Seeding | Periventricular | 1 | 147 | 62.015625 |
| Subject 2, Test 2, No Seeding | Juxtacortical | 2 | 63 | 26.578125 |
| Subject 2, Test 2, No Seeding | Subcortical | 1 | 132 | 55.6875 |
| Subject 2, Test 2, No Seeding | Infratentorial | 1 | 1 | 0.421875 |
| Subject 2, Test 3, No Seeding | Periventricular | 2 | 178 | 75.09375 |
| Subject 2, Test 3, No Seeding | Juxtacortical | 1 | 49 | 20.671875 |
| Subject 2, Test 3, No Seeding | Subcortical | 1 | 142 | 59.90625 |
| Subject 2, Test 3, No Seeding | Infratentorial | 0 | 0 | 0.0 |
| Subject 2, Test 1, Seeding | Periventricular | 1 | 144 | 60.75 |
| Subject 2, Test 1, Seeding | Juxtacortical | 2 | 68 | 28.6875 |
| Subject 2, Test 1, Seeding | Subcortical | 1 | 142 | 59.90625 |
| Subject 2, Test 1, Seeding | Infratentorial | 1 | 1 | 0.421875 |
| Subject 2, Test 2, Seeding | Periventricular | 1 | 153 | 64.546875 |
| Subject 2, Test 2, Seeding | Juxtacortical | 2 | 74 | 31.21875 |
| Subject 2, Test 2, Seeding | Subcortical | 1 | 140 | 59.0625 |
| Subject 2, Test 2, Seeding | Infratentorial | 0 | 0 | 0.0 |
| Subject 2, Test 3, Seeding | Periventricular | 1 | 145 | 61.171875 |
| Subject 2, Test 3, Seeding | Juxtacortical | 2 | 74 | 31.21875 |
| Subject 2, Test 3, Seeding | Subcortical | 1 | 146 | 61.59375 |
| Subject 2, Test 3, Seeding | Infratentorial | 1 | 1 | 0.421875 |

dagarcial commented 1 month ago

Hi @jqmcginnis,

I agree that your approach to seeding appears to be correct.

However, the lack of observed consistency might be related to the registration process, as @twiltgen mentioned. To illustrate this, I conducted a simple test using the Greedy tool, which, as I understand, is utilized for image registration in LST-AI. Despite using the same input brain and template, the registration process produced slightly different output images.

These variations in the registered images could lead to differences in the input provided to LST-AI's segmentation process. As a result, this might cause variations in the segmented lesion zones or, at the very least, in the estimated lesion sizes.
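One way to quantify how much two runs disagree at the voxel level, rather than only through lesion counts and volumes, is the Dice overlap of the two binary segmentation masks. The sketch below is a plain-Python illustration on a toy example; the function name `dice` and the flat 0/1 input format are assumptions made here, not something LST-AI provides.

```python
def dice(mask_a, mask_b):
    """Dice overlap of two binary masks given as flat iterables of 0/1."""
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as lesion in both runs.
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: two runs that disagree on a single voxel.
run_1 = [0, 1, 1, 1, 0, 0]
run_2 = [0, 1, 1, 0, 0, 0]
score = dice(run_1, run_2)
print(score)
```

For real outputs, the masks would first be loaded and flattened, for example with a NIfTI reader such as nibabel (an assumption about tooling). A score of 1.0 means the two runs produced identical masks.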