Build needed data model for analysis results

ebrahimebrahim commented 2 years ago

We should have an AnalysisBatch, just as we have a PreprocessingBatch. This would allow running analysis more than once with different settings, let us handle errors that arise during analysis runs, and give us an analysis batch ID so that we know later on which run generated which artifacts.
AnalysisResult: represents the results of an AnalysisBatch that ran successfully. Contains:
- associated AnalysisBatch ID
- link to download the zipped up results
- pointer to a list of CorrelationResults; there will be one of these for each variable (e.g. age, CDR)
CorrelationResult: Has the correlation analysis information for a specific variable. Contains:
- id of associated AnalysisResult
- variable name (e.g. age, CDR)
- links to NIfTI images of correlation values (one for allocation, one for transport, and one for vbm)
- links to NIfTI images of p-values (again one for allocation, one for transport, and one for vbm)
- link to atlas
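
To make the proposal concrete, here is a rough sketch of AnalysisResult and CorrelationResult as plain Python dataclasses. These are not actual Django models; the field names (`analysis_batch_id`, `zip_url`, `correlation_images`, `pvalue_images`, `atlas`) are placeholders I made up for illustration, and the links are modeled as plain strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The three analyses that each produce a correlation image and a p-value image.
ANALYSES = ("allocation", "transport", "vbm")


@dataclass
class CorrelationResult:
    """Correlation analysis information for one variable (hypothetical field names)."""
    analysis_result_id: int
    variable: str                       # e.g. "age" or "CDR"
    correlation_images: Dict[str, str]  # analysis name -> link to NIfTI image
    pvalue_images: Dict[str, str]       # analysis name -> link to NIfTI image
    atlas: str                          # link to the atlas


@dataclass
class AnalysisResult:
    """Results of an AnalysisBatch that ran successfully (hypothetical field names)."""
    id: int
    analysis_batch_id: int
    zip_url: str                        # link to download the zipped-up results
    correlation_results: List[CorrelationResult] = field(default_factory=list)
```

Each AnalysisResult would then hold one CorrelationResult per variable, with both image dictionaries keyed by the names in `ANALYSES`.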

@AlmightyYakob What do you think of an organization like this? Let's iterate on this from here. Maybe AnalysisResult is redundant and could be merged into AnalysisBatch?

jjnesbitt commented 2 years ago

I think this design can be simplified a bit. Since analysis is treated as one unit (unlike preprocessing), I think we could probably do with just one model, AnalysisResult.

So AnalysisResult would contain the following:

status - Run status (pending/running/finished/failed)
error_message - The error message if it failed.
preprocessing_batch - The preprocessing batch it was run from. Linking it to this batch instead of the dataset will allow for running preprocessing with varying options, and propagating that to the analysis.
zip - Link to the zipped up results
~~atlas - The atlas used~~
data - This is where the result images would live
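
As a rough sketch of the fields above (a plain Python dataclass, not the actual Django model), the single-model design might look like the following. The `Status` enum, the `fail` helper, and the types and defaults are my own assumptions for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class Status(str, Enum):
    """Run status values listed above."""
    PENDING = "pending"
    RUNNING = "running"
    FINISHED = "finished"
    FAILED = "failed"


@dataclass
class AnalysisResult:
    preprocessing_batch_id: int         # the preprocessing batch it was run from
    status: Status = Status.PENDING
    error_message: str = ""             # only meaningful when status is FAILED
    zip: Optional[str] = None           # link to the zipped-up results
    data: Dict[str, Any] = field(default_factory=dict)  # result images

    def fail(self, message: str) -> None:
        """Hypothetical helper: record a failed run together with its error."""
        self.status = Status.FAILED
        self.error_message = message
```

Keeping the status and error message on the same record as the results is what lets one model cover both the failed and the successful case.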

I was thinking of placing all of the resulting images in the data field, in a JSON/dictionary structure. It would look like the following:

{
    "<variable> (e.g. Age)": {
        "allocation": {
            "correlation": "...",
            "pvalue": "..."
        },
        "transport": {
            "correlation": "...",
            "pvalue": "..."
        },
        "vbm": {
            "correlation": "...",
            "pvalue": "..."
        }
    }
}

This is the structure that made the most sense to me, since the image we display in the UI is determined by the combination of the selected variable and the selected analysis. This structure makes it easy to index by these two values and to retrieve both the correlation and the p-value image.
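
For illustration, a lookup against that structure could be as simple as the sketch below. The `get_images` name and the file names in `example` are made up; the real values would be whatever links end up stored in the data field.

```python
from typing import Dict

# variable -> analysis -> {"correlation": link, "pvalue": link}
ResultData = Dict[str, Dict[str, Dict[str, str]]]


def get_images(data: ResultData, variable: str, analysis: str) -> Dict[str, str]:
    """Return the correlation and p-value image links for one variable/analysis pair.

    Raises KeyError if the variable or analysis is not present.
    """
    return data[variable][analysis]


# Placeholder data with hypothetical file names.
example: ResultData = {
    "Age": {
        "allocation": {"correlation": "age_allocation_corr.nii", "pvalue": "age_allocation_p.nii"},
        "transport": {"correlation": "age_transport_corr.nii", "pvalue": "age_transport_p.nii"},
        "vbm": {"correlation": "age_vbm_corr.nii", "pvalue": "age_vbm_p.nii"},
    },
}

images = get_images(example, "Age", "transport")
```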

@ebrahimebrahim What do you think?

ebrahimebrahim commented 2 years ago

@AlmightyYakob This makes a lot of sense. I didn't realize you can stick a JSON/dictionary structure into one of these Django data models. This organization looks great!

jjnesbitt commented 2 years ago

It seems some revision of the preprocessed image models is needed. In short, the following will take place as a part of addressing this issue:

The atlas field on each preprocessed image will be migrated into the associated preprocessing batch, and removed from the images themselves.
For the AnalysisResult model, the atlas isn't actually needed, as it's already stored on the preprocessing batch. We can just point to (likely join) that whenever needed.
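
The logic of moving the atlas could be sketched roughly as below, using plain dictionaries rather than the real models. The `hoist_atlas` name and the `batch_id` and `atlas` keys are hypothetical, and it assumes every image in a batch shares the same atlas, raising an error otherwise.

```python
from typing import Dict, List


def hoist_atlas(batches: Dict[int, dict], images: List[dict]) -> None:
    """Move the atlas from each image onto its preprocessing batch, in place.

    Assumes all images in a batch share one atlas; raises ValueError if not.
    """
    for image in images:
        batch = batches[image["batch_id"]]
        atlas = image["atlas"]
        # The first image seen for a batch sets the atlas; later ones must agree.
        existing = batch.setdefault("atlas", atlas)
        if existing != atlas:
            raise ValueError(
                f"batch {image['batch_id']} has conflicting atlases: "
                f"{existing!r} and {atlas!r}"
            )
        # Remove the field from the image once it lives on the batch.
        del image["atlas"]
```

After this runs, anything that needs the atlas (including an AnalysisResult) reads it from the batch.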

KitwareMedical / otm-server

Build needed data model for analysis results #113