sciai-lab / batchlib

Batch processing for high-throughput screening data
MIT License

Keep track of all analysis pipeline parameters #98

Closed constantinpape closed 4 years ago

constantinpape commented 4 years ago

We should keep track of all parameters that were used to run a given experiment, e.g. QC parameters, thresholds, etc. These should go into:

On the implementation side, I would gather them all into a subgroup of the configparser parameters, so that we can easily dump all of this into the logs / tables.
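A minimal sketch of the subgroup idea, assuming a plain `argparse` parser stands in for the project's config parser. The function names `build_parser` and `get_analysis_parameters` are hypothetical and not taken from batchlib; the parameter names and defaults are borrowed from the table dump further down in this thread.

```python
import argparse
import json


def build_parser():
    """Build a parser whose analysis parameters live in one argument group."""
    parser = argparse.ArgumentParser(description="hypothetical analysis workflow")
    # A regular option that is NOT an analysis parameter.
    parser.add_argument("--input_folder", default="")

    # All analysis parameters are registered through the same group.
    group = parser.add_argument_group("analysis parameters")
    actions = [
        group.add_argument("--infected_detection_threshold", type=float, default=6.2),
        group.add_argument("--qc_cells_min_size_threshold", type=int, default=1000),
        group.add_argument("--qc_cells_max_size_threshold", type=int, default=10000),
    ]
    # add_argument returns the Action; its public `dest` is the attribute name.
    analysis_keys = [action.dest for action in actions]
    return parser, analysis_keys


def get_analysis_parameters(args, analysis_keys):
    """Collect only the analysis parameters from the parsed arguments."""
    return {key: getattr(args, key) for key in analysis_keys}


parser, analysis_keys = build_parser()
args = parser.parse_args(["--infected_detection_threshold", "4.8"])
analysis_parameters = get_analysis_parameters(args, analysis_keys)

# The resulting dict can be written to a log or a table in one go.
print(json.dumps(analysis_parameters, indent=2))
```

Recording the `dest` of each registered argument avoids relying on private attributes of the argument group, and the resulting dict is what would be dumped to the logs or written as a table.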

constantinpape commented 4 years ago

cc @imagirom @tischi @wolny

constantinpape commented 4 years ago

Will be implemented in #99

constantinpape commented 4 years ago

@wolny once #99 is merged we can also have a look into how we write this to the DB.

constantinpape commented 4 years ago

@wolny I reassigned this to you, because the only thing left is writing this to the DB. The analysis parameters are stored in the table plate/analysis_parameter.

wolny commented 4 years ago

Thanks for the info @constantinpape. If this is just another table, then it's already stored in the DB (the DB job parses all the tables and adds them to the results document). However, we might want to treat this object differently and pull it out as a top-level attribute of the results document. I'll make a draft PR to discuss this.

wolny commented 4 years ago

Yes, I've double checked: it's already stored in the DB, e.g. for plate plateT3rep1_20200509_152617_891:

{
    "table_name" : "plate/analysis_parameter",
    "results" : [
        {
            "plate_name" : "plateT3rep1_20200509_152617_891_table.hdf5",
            "marker_denoise_radius" : 0,
            "dont_ignore_nuclei" : "False",
            "infected_detection_threshold" : 6.2,
            "scale_infected_detection_with_mad" : "True",
            "qc_cells_max_size_threshold" : 10000,
            "qc_cells_min_size_threshold" : 1000,
            "qc_images_max_number_cells" : 1000,
            "qc_images_min_number_cells" : 10,
            "qc_wells_max_number_cells_per_image" : 1000,
            "qc_wells_min_number_cells_per_image" : 10,
            "qc_wells_min_number_control_cells_per_image" : 5,
            "qc_wells_min_fraction_of_control_cells" : 0.05,
            "qc_wells_check_ratios" : "True",
            "fixed_background" : "True",
            "background_serum_IgG_corrected" : 1300,
            "background_serum_IgA_corrected" : 1800,
            "background_marker_corrected" : "plate/backgrounds"
        }
    ]
}

However, IMO it doesn't belong there and should be saved as a top-level attribute of the result object.
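A rough sketch of what pulling the parameters up to the top level could look like. The layout of the results document is an assumption here: the `tables` list and the `analysis_parameters` key are hypothetical names, and only `table_name`, `results` and `plate/analysis_parameter` come from the dump above.

```python
import copy


def promote_analysis_parameters(result_doc, table_name="plate/analysis_parameter"):
    """Return a copy of result_doc with the parameter table moved to the top level.

    Assumes (hypothetically) that the tables live in a list under "tables".
    """
    doc = copy.deepcopy(result_doc)
    remaining = []
    for table in doc.get("tables", []):
        if table.get("table_name") == table_name:
            rows = table.get("results", [])
            # The parameter table holds a single row per plate.
            doc["analysis_parameters"] = rows[0] if rows else {}
        else:
            remaining.append(table)
    doc["tables"] = remaining
    return doc


# Hypothetical, heavily shortened results document.
example = {
    "plate_name": "plateT3rep1_20200509_152617_891",
    "tables": [
        {
            "table_name": "plate/analysis_parameter",
            "results": [{"infected_detection_threshold": 6.2, "fixed_background": "True"}],
        },
        {"table_name": "some/other_table", "results": []},
    ],
}

promoted = promote_analysis_parameters(example)
print(promoted["analysis_parameters"])
```

Working on a deep copy keeps the original document untouched, so the change in layout can be compared against the current one in the draft PR.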