copy of email to Pika:
Hi All,
I wanted to document some steps to take to look at the latest results from segmentation.
The notebook for inspection can be found here, and a conda environment specification for using it is here.
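For anyone setting up from scratch, creating the environment from that spec looks something like this; the file and environment names below are placeholders for whatever the linked spec actually defines, and I'm assuming the spec includes jupyter:

conda env create -f segmentation_inspection.yml   # placeholder name for the linked env spec
conda activate segmentation_inspection            # env name comes from the spec file
jupyter notebook                                  # then open the inspection notebook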
I have run our 18 test experiments through the current detection phase (no filtering, no merging yet). I suspect that Scott has settings for some window_size parameters for inhibitory lines, but I don't know the exact values - we can swap those in next week once I do. For now, everything is done with the default values of the latest segmentation branch. I used the slurm script copied at the end of this message to perform the segmentation. (One experiment actually timed out; it will finish in a couple of hours.)
The 2nd cell in the notebook should define:
sqlite_path = Path("/allen/aibs/informatics/danielk/djk_sep01.db")
This particular sqlite DB file trims the options down to (a quick way to peek at the DB from the shell follows this list):
backgrounds (denoised max and avg projections, correlation metric)
segmentations (legacy from LIMS; the new detect phase described above)
videos (motion corrected and denoised)
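If you just want a quick look at what is in that DB outside the notebook, the sqlite3 command-line tool is enough; I'm not assuming anything about the schema here, just listing whatever tables it contains:

# list the tables in the evaluation DB (the notebook code defines how they are used)
sqlite3 /allen/aibs/informatics/danielk/djk_sep01.db ".tables"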
As a reminder, the notebook lets you customize views for comparing ROIs against different background images, example below. I'll be using this part of the notebook to look for any glaring problems and will add them to this previous set of observations: http://confluence.corp.alleninstitute.org/display/IT/prototype+segmentation+observations
I will also add in a filter step after the detect phase, which will appear as an additional dataset option when ready.
If you have any trouble running this notebook, let me know so I can fix it.
Dan
And the slurm script:
#!/bin/bash
#SBATCH --job-name=ophys-segmentation
#SBATCH --mail-type=NONE
#SBATCH --ntasks=24
#SBATCH --mem=140gb
#SBATCH --time=04:00:00
#SBATCH --output=/allen/aibs/informatics/danielk/deepinterpolation/logs/segmentation_%A-%a.log
#SBATCH --partition braintv
#SBATCH --array=12
# stage temporary files on node-local fast scratch
export TMPDIR=/scratch/fast/${SLURM_JOB_ID}
# pinned ophys_etl_pipelines image
image=docker://alleninstitutepika/ophys_etl_pipelines:14d0157a589fc3b4e5055fed63709a783070b6bf
# the 18 test ophys_experiment_ids; SLURM_ARRAY_TASK_ID selects one
eids=(
785569470
785569447
788422859
795012008
795011996
788422825
795897800
795901895
795901850
806862946
803965468
806928824
951980484
1048483611
1048483613
951980473
1048483616
850517348
)
# per-experiment inputs (denoised movie, correlation graph) and the HDF5 log/output file
expdir=/allen/programs/braintv/workgroups/nc-ophys/danielk/deepinterpolation/experiments/ophys_experiment_${eids[$SLURM_ARRAY_TASK_ID]}
video=${expdir}/videos/deep_denoised.h5
graph=${expdir}/backgrounds/deep_denoised_filtered_hnc_Gaussian_graph.pkl
logpath=${expdir}/djk_sep01_assessment.h5
# run the detect phase inside the container
SINGULARITY_TMPDIR=${TMPDIR} singularity run \
--bind /allen:/allen,${TMPDIR}:/tmp \
${image} \
/envs/ophys_etl/bin/python -m ophys_etl.modules.segmentation.modules.feature_vector_segmentation \
--graph_input ${graph} \
--video_input ${video} \
--log_path ${logpath} \
--n_parallel_workers 24 \
--seeder_args.keep_fraction 0.1 \
--seeder_args.n_samples 24
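For reference, submitting the detect step for all 18 experiments (rather than the single index-12 re-run above) would look roughly like this, assuming the script is saved as detect.slurm (the file name is a placeholder):

# bash arrays are 0-indexed, so the 18 eids map to array indices 0-17;
# the command-line --array overrides the #SBATCH --array=12 line in the script
sbatch --array=0-17 detect.slurm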
For the filter step, I used the same slurm script as above, but added:
SINGULARITY_TMPDIR=${TMPDIR} singularity run \
--bind /allen:/allen,${TMPDIR}:/tmp \
${image} \
/envs/ophys_etl/bin/python -m ophys_etl.modules.segmentation.modules.filter_z_score \
--min_z 2.0 \
--graph_input ${graph} \
--pipeline_stage post_detect_filter \
--log_path ${logpath}
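To sanity-check that both the detect and filter steps actually wrote to the assessment file, a structural dump is enough; I'm not assuming anything about the internal layout, just listing whatever groups the two steps created (expdir as defined in the script above):

# recursively list the contents of the per-experiment HDF5 log
h5ls -r ${expdir}/djk_sep01_assessment.h5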
This sqlite file now has the detect and filter steps for all 18 experiments:
sqlite_path = Path("/allen/aibs/informatics/segmentation_eval_dbs/djk_sep01.db")
List of experiments and notes, comparing the legacy LIMS valid ROIs against the current state of the detect + z-score filter valid ROIs.
The new segmentation is clearly an improvement over legacy. We will still need a classifier to clean some ROIs up.
Once #294 is done and we are storing ROIs directly in the HDF5 output file, we should re-run our 18 example experiments (or the data cube delivered by the science team, if that is available at this time) to both
For reference, the ophys_experiment_ids of the example experiments we have been using are the 18 IDs listed in the eids array of the slurm script above.
Tasks
Optional
Validation