ebi-ait / ingest-graph-validator

HCA Ingest Service Graph Validation Suite
MIT License

20200522: Ensure one fastq set per lane index #11

Closed: mshadbolt closed this 3 years ago

mshadbolt commented 4 years ago

Describe the issue

As a data wrangler, I would like to test the integrity of the dataset I'm wrangling by ensuring that no more than one set of fastq files is assigned to any one lane index. This could extend the ensure_lane_index test or be created as a new test.

Specifically, I would like the test to ensure that at most one each of read1, read2, index1 and index2 occurs within one lane index of one library prep.

Test template

Acceptance criteria

ESapenaVentura commented 3 years ago

https://github.com/ebi-ait/ingest-graph-validator/pull/36

I have done something similar that I think also solves #31 and #30.

The query concatenates process_id, lane_index and read_index, creating a combination that should always be unique.

It then counts the distinct files within each of these combinations and returns the process_id, the number of files, and an error message if that number is above 1.
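
For anyone reading along, here is a rough Python sketch of the same grouping logic. It is not the query from the PR (which runs against the ingest graph); the `SequenceFile` record, its field layout and the `find_duplicate_fastqs` name are made up for illustration, and it assumes `process_id` identifies one library prep / sequencing process.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class SequenceFile(NamedTuple):
    """Hypothetical flat record, not the actual ingest graph schema."""
    process_id: str
    lane_index: int
    read_index: str  # e.g. "read1", "read2", "index1", "index2"
    file_name: str


def find_duplicate_fastqs(files: Iterable[SequenceFile]) -> List[Dict[str, object]]:
    """Report every (process_id, lane_index, read_index) with more than one distinct file."""
    # Group distinct file names under the combination that should be unique.
    distinct_files = defaultdict(set)
    for f in files:
        distinct_files[(f.process_id, f.lane_index, f.read_index)].add(f.file_name)

    errors = []
    for (process_id, lane_index, read_index), names in sorted(distinct_files.items()):
        if len(names) > 1:
            errors.append({
                "process_id": process_id,
                "file_count": len(names),
                "message": (
                    f"{len(names)} files share lane_index {lane_index} "
                    f"and read_index {read_index}; expected at most 1"
                ),
            })
    return errors


if __name__ == "__main__":
    example = [
        SequenceFile("process_1", 1, "read1", "a_R1.fastq.gz"),
        SequenceFile("process_1", 1, "read2", "a_R2.fastq.gz"),
        SequenceFile("process_1", 1, "read1", "b_R1.fastq.gz"),  # clashes with a_R1
        SequenceFile("process_1", 2, "read1", "c_R1.fastq.gz"),
    ]
    for error in find_duplicate_fastqs(example):
        print(error["process_id"], error["file_count"], error["message"])
```

Using a set per combination means the same file listed twice is not reported; only genuinely different files sharing a lane index and read index are flagged.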

ESapenaVentura commented 3 years ago

@ami-day tagging you here because I think this also solves your tickets

ESapenaVentura commented 3 years ago

Will be done soon :)