sanger / sequencescape

Web based LIMS

Y24-058 [BUG] Study data release information - data clean up [Due 13/11/24] #4084

Open KatyTaylor opened 6 months ago

KatyTaylor commented 6 months ago

Describe the bug

During development of https://github.com/sanger/sequencescape/issues/4076, I found that many of the existing Studies in the Sequencescape database do not conform to the new validation rules. Because of this, I deployed the new server-side validation for that story 'feature flagged off'. This story is to track the resolution of this. Ideally, we can 'fix' all the existing studies, so that they pass the new validation, and then remove the feature flag.

There is an ongoing conversation via email between Katy, Liz C and Governance as to whether the validation rules are correct, and whether we can 'fix' those studies.

Additional context

See query to find records that fail the validation here: https://github.com/sanger/sequencescape/issues/4076#issuecomment-2045269998

Table that shows the expected relationships between fields, which the new validation would enforce:

| Data release strategy | Data release timing |
| --- | --- |
| Managed or Open | Standard, Immediate or Delayed |
| Not applicable | Never |
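As a rough illustration of the pairings in the table above, the check could be sketched as follows. This is not the Sequencescape implementation: the function name, the dictionary and the lower-cased string values are assumptions made for the example, and the real stored values may differ.

```python
# Hypothetical sketch of the strategy/timing rule from the table above.
# The names and the exact string values are assumptions, not taken from
# the Sequencescape code base.
ALLOWED_TIMINGS = {
    "managed": {"standard", "immediate", "delayed"},
    "open": {"standard", "immediate", "delayed"},
    "not applicable": {"never"},
}


def release_fields_consistent(strategy: str, timing: str) -> bool:
    """Return True if the timing is permitted for the given strategy."""
    allowed = ALLOWED_TIMINGS.get(strategy.strip().lower())
    # An unrecognised strategy is treated as failing the check.
    return allowed is not None and timing.strip().lower() in allowed
```

Under these assumptions, `release_fields_consistent("Managed", "Delayed")` passes, while `release_fields_consistent("Not applicable", "Standard")` fails and would be a study that needs fixing.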

To remove the feature flag, merge the following pull request: https://github.com/sanger/sequencescape/pull/4086

This will be the first time the server-side validation is enabled, so consider doing some extra testing of it.

Solution: prepare a spreadsheet broken down by faculty. Each study's data release strategy needs to match its data release timing.

Do we still need to do this for inactive studies? Could we change the validation so that it doesn't run if a study is inactive? Maybe they haven't started yet? What if they have samples that have an accession number?

stevieing commented 1 week ago

43 studies to change. They were generated by a script, so they are missing information.

@KatyTaylor Neil has some good knowledge of this so can do some useful work to prepare the information for faculty to review.

neilsycamore commented 4 days ago

Attachment: Y24-058_studies_imported_from_SNP.csv

There are 945 studies that were imported from a previous tracking LIMS (ETS) in Oct 2010. All have 'Automatically imported from SNP, please check & update all information' in the study description. Many of these have new lines or carriage returns in the description. The attached CSV has had these removed, so it can be used as an xlsx worksheet. Next step is to forward this to the Tech admin/SSR team.
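For anyone repeating the clean-up, one possible way to strip new lines and carriage returns from CSV fields is sketched below. It is not the script used to produce the attachment; the function names are hypothetical, and it assumes the export quotes any field that contains a line break.

```python
import csv
import io
import re


def flatten_field(value: str) -> str:
    """Collapse carriage returns, new lines and runs of whitespace into single spaces."""
    return re.sub(r"\s+", " ", value).strip()


def clean_csv(src, dst) -> int:
    """Copy CSV rows from src to dst, flattening every field; return the row count."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    count = 0
    for row in reader:
        writer.writerow([flatten_field(cell) for cell in row])
        count += 1
    return count


# Small in-memory example; real files should be opened with newline="".
source = io.StringIO('id,description\n1,"Automatically imported from SNP,\r\nplease check"\n')
target = io.StringIO()
rows_written = clean_csv(source, target)
```

Every field is flattened rather than only the description column, so the sketch does not depend on knowing the column layout of the export.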