sanger / General-Backlog-Items

Broad bucket to collate backlog items that have no obvious repository

ST0006 As a TL I would like to check the report consistency, identify and flag inconsistencies so that we can improve confidence in the report. #394

Open TWJW-SANGER opened 6 months ago

TWJW-SANGER commented 6 months ago

As a TL I would like to check the report consistency, identify and flag inconsistencies so that we can improve confidence in the report.

Acceptance Criteria

Outputs: new stories to identify the underlying cause of any incorrect rows and possible fixes.

sabrine33 commented 5 months ago

Better to look into this after #404 is resolved, as that should update/populate some of the event dates

khelwood commented 5 months ago

Total rows: 34117

labware_received is null but other dates are present: 4765

Breakdowns by study name, latest date in row, and barcode prefix (not preserved).

library_start_first is null but sequencing_run_start_first or sequencing_qc_complete_first is not null: 7049

Breakdowns by study name, latest date in row, and barcode prefix (not preserved).

sequencing_run_start_first is null but sequencing_qc_complete_first is not null: 426

Breakdowns by study name, latest date in row, and barcode prefix (not preserved).
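For anyone wanting to reproduce the three counts above, here is a minimal sketch of the checks in Python. It assumes each report row is available as a dict keyed by column name, with `None` for NULL. Treating "other dates" as the three later date columns named in this thread is an assumption, and the check names are made up for illustration.

```python
from typing import Callable, Dict, Iterable, Mapping, Optional

Row = Mapping[str, Optional[str]]

# Assumption: "other dates" means these three later date columns.
LATER_DATES = (
    "library_start_first",
    "sequencing_run_start_first",
    "sequencing_qc_complete_first",
)


def labware_received_missing(row: Row) -> bool:
    """labware_received is null but at least one later date is present."""
    return row.get("labware_received") is None and any(
        row.get(column) is not None for column in LATER_DATES
    )


def library_start_missing(row: Row) -> bool:
    """library_start_first is null but a sequencing date is present."""
    return row.get("library_start_first") is None and (
        row.get("sequencing_run_start_first") is not None
        or row.get("sequencing_qc_complete_first") is not None
    )


def run_start_missing(row: Row) -> bool:
    """sequencing_run_start_first is null but QC complete is present."""
    return (
        row.get("sequencing_run_start_first") is None
        and row.get("sequencing_qc_complete_first") is not None
    )


# Hypothetical check names, used only as keys in the result.
CHECKS: Dict[str, Callable[[Row], bool]] = {
    "labware_received_missing": labware_received_missing,
    "library_start_missing": library_start_missing,
    "run_start_missing": run_start_missing,
}


def count_inconsistencies(rows: Iterable[Row]) -> Dict[str, int]:
    """Count how many rows fail each consistency check."""
    counts = {name: 0 for name in CHECKS}
    for row in rows:
        for name, check in CHECKS.items():
            if check(row):
                counts[name] += 1
    return counts
```

A row can fail more than one check, so the counts are not expected to sum to the number of inconsistent rows.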

TWJW-SANGER commented 5 months ago

Adding some notes of my own.

RE: labware_received is null but other dates are present

Looking at the first study that has this problem, 1098-PF-ET-GOLASSA, gives 2 plates: SQPP-52926-B and SQPP-52927-C. Only SQPP-52927-C has NULL for labware_received.

Running the Confluence query "Pull All Event Data For A Piece of Labware" for SQPP-52927-C gives only events of type sample_manifest.created, labwhere_create and sample_manifest.updated.

The MLWH_Events 101 page says of 'labware_received': "When a labware is received via Sequencescape Labwhere Reception." Creating this event in Sequencescape allows it to have a sample subject, as Sequencescape knows about samples.

The labwhere_create event_type is created by LabWhere (which doesn't know about samples).

What I think is happening is that the lab sometimes scans labware into Sequencescape's Labwhere reception page and sometimes scans it straight into the LabWhere application.

We probably need another story to find the first scanned-in event between the labwhere_create and labware_received events. NB: this is complicated by the first CTE in the sample_tracking_view, which filters on the sample role, and the labwhere_create event will not have that role.
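As a starting point for that story, a rough sketch of "first scanned-in event per labware" might look like the following. The `Event` shape (barcode, event type, timestamp) is hypothetical and not the actual events schema. Only the two event type names come from this thread.

```python
from datetime import datetime
from typing import Dict, Iterable, NamedTuple

# Either event type can represent the labware being scanned in.
SCAN_IN_EVENT_TYPES = {"labwhere_create", "labware_received"}


class Event(NamedTuple):
    """Hypothetical, simplified view of one event for one piece of labware."""
    barcode: str
    event_type: str
    occurred_at: datetime


def first_scan_in(events: Iterable[Event]) -> Dict[str, Event]:
    """Return the earliest scan-in event for each labware barcode."""
    earliest: Dict[str, Event] = {}
    for event in events:
        if event.event_type not in SCAN_IN_EVENT_TYPES:
            continue  # e.g. sample_manifest.created is not a scan-in
        current = earliest.get(event.barcode)
        if current is None or event.occurred_at < current.occurred_at:
            earliest[event.barcode] = event
    return earliest
```

Because this works per labware rather than per sample, it sidesteps the sample-role filter. The result would still need joining back onto samples in the real view.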

khelwood commented 5 months ago

@tw6 It looks to me like the rows where sequencing_run_start_first is null but sequencing_qc_complete_first is not null occur where the sequencing_start event is before the date cutoff for events (two years ago). If that's a problem, we'd have to include older events and join them onto the query.

TWJW-SANGER commented 5 months ago

Ok, an artifact of the cutoff date for that one. I think that is acceptable and we can add a note that this can happen in the view.
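To illustrate the cutoff artifact for the note, here is a small sketch of how a date cutoff can leave a later date populated while an earlier one is missing. It assumes events are simple (event type, timestamp) pairs, which is a simplification and not the real view logic.

```python
from datetime import datetime
from typing import Dict, Iterable, Optional, Tuple


def first_dates(
    events: Iterable[Tuple[str, datetime]],
    cutoff: Optional[datetime] = None,
) -> Dict[str, datetime]:
    """Earliest timestamp per event type, ignoring events before the cutoff."""
    firsts: Dict[str, datetime] = {}
    for event_type, occurred_at in events:
        if cutoff is not None and occurred_at < cutoff:
            continue  # events older than the cutoff are never seen
        if event_type not in firsts or occurred_at < firsts[event_type]:
            firsts[event_type] = occurred_at
    return firsts
```

If a run starts just before the cutoff and completes QC just after it, the start event is dropped and only the QC date survives, which matches the pattern seen in those rows.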

TWJW-SANGER commented 5 months ago

Check iRODS locations when no sequencing run dates have been created.

khelwood commented 5 months ago

sequencing_run_start_first is null but irods_root_collections is not null: 8975

Breakdowns by study name, latest date in row, and barcode prefix (not preserved).

khelwood commented 5 months ago

sequencing_qc_complete_first is null but irods_root_collections is not null: 8903

Breakdowns by study name, latest date in row, and barcode prefix (not preserved).
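The two iRODS counts above can be expressed as simple NULL checks. The sketch below uses Python's built-in `sqlite3` purely as a stand-in for the real warehouse. The table name `sample_tracking` is hypothetical; only the column names come from this thread.

```python
import sqlite3
from typing import Dict

# WHERE conditions for the two iRODS checks; keys are illustrative names.
IRODS_CHECKS = {
    "run_start_null_but_irods_present": (
        "sequencing_run_start_first IS NULL "
        "AND irods_root_collections IS NOT NULL"
    ),
    "qc_complete_null_but_irods_present": (
        "sequencing_qc_complete_first IS NULL "
        "AND irods_root_collections IS NOT NULL"
    ),
}


def count_irods_mismatches(
    conn: sqlite3.Connection, table: str = "sample_tracking"
) -> Dict[str, int]:
    """Count rows with iRODS data but a missing sequencing date."""
    counts: Dict[str, int] = {}
    for name, condition in IRODS_CHECKS.items():
        # The table name is interpolated, so it must come from trusted code.
        query = f"SELECT COUNT(*) FROM {table} WHERE {condition}"
        (counts[name],) = conn.execute(query).fetchone()
    return counts
```

The same conditions should carry over to the real view largely unchanged, since `IS NULL` and `IS NOT NULL` are standard SQL.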