Open kbergin opened 5 years ago
Depending on the output of the spike, this may have a dependency on #414.
@malloryfreeberg, @barkasn already mentioned this to me at the F2F, but when you're drafting the RFC, if it looks like there may be significant ingest work, could you dial me and @aaclan-ebi in as soon as makes sense? I'd like to have the heads-up sooner rather than later.
@justincc sure thing. I'm working on this in the current sprint, so I'll reach out as soon as it's necessary.
Is there a design document or RFC that can be linked into this Spike to help track progress and state?
@malloryfreeberg's library entity RFC here which overlaps. I believe these RFCs will get reconciled once @barkasn is back.
Mallory's RFC is in community review:
Nick's RFC is in community review
Per the July 18 Refinement meeting, the Milestone needs to be updated to reflect when the RFCs will be reconciled and approved.
Updating to Milestone 2. @morrisonnorman and @kbergin - please correct if you disagree.
Discussed during the August 15 Refinement meeting - there are multiple problems with this issue:
There are four owners but no single owner who is driving this to completion. @diekhans has volunteered to update and drive this spike until @morrisonnorman returns on August 25. In general, the preference is for Product Owners to own and drive issues.
There have been no regular updates about the status of either RFC. It is believed that Nick's RFC has been withdrawn and will be resubmitted for community review. @jkaneria and @barkasn - please comment. I've only seen the original announcement for community review. The RFC indicates a TBD Last Call for Community Review. Mallory's RFC is completing Oversight review today. I would strongly recommend that this type of information be maintained in the top-level summary comment of this issue, so reviewers do not need to scroll for status.
As a result, this now slips from Milestone 2 to Milestone 3.
RFC: Processing Datasets that Span Multiple Data Collection Runs #88 https://github.com/HumanCellAtlas/dcp-community/pull/88 has been significantly updated, with a Last Call for Community Review of Aug 27th.
The related RFC is tech arch approved: https://github.com/HumanCellAtlas/dcp-community/pull/87
Spike currently blocked by:
* [Processing Datasets that Span Multiple Data Collection Runs RFC](https://github.com/HumanCellAtlas/dcp-community/pull/88), which is in oversight review until September 27
This ticket represents the work to spike on implementation for this theme. An RFC will be the end result.
For the theme:
User Story: A consumer can download a uniformly processed matrix file that reflects all the data processing results from one sequencing library, such as all sequencing lane replicates for a high-coverage 10x experiment being processed together.
Demoable Criteria: Process a dataset with a multi-lane/multi-machine sequencing strategy for a single library and confirm our results are comparable to the outputs produced by the submitting lab.
Success Metric: We can detect and correctly process datasets where the libraries are found in multiple sequencing lanes.
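To make the success metric a bit more concrete, here is a minimal sketch of the "detect" step, assuming each sequence file carries a library identifier and a lane number. The `SequenceFile` type and its field names are hypothetical placeholders for illustration, not the DCP metadata schema or anything defined in the RFCs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class SequenceFile(NamedTuple):
    # Hypothetical record; field names are assumptions, not the DCP schema.
    library_id: str
    lane: int
    path: str


def group_by_library(files: Iterable[SequenceFile]) -> Dict[str, List[SequenceFile]]:
    """Collect all files for each library so they can be processed together."""
    groups: Dict[str, List[SequenceFile]] = defaultdict(list)
    for f in files:
        groups[f.library_id].append(f)
    return dict(groups)


def multi_lane_libraries(files: Iterable[SequenceFile]) -> Dict[str, List[int]]:
    """Return only the libraries whose files span more than one lane."""
    lanes = defaultdict(set)
    for f in files:
        lanes[f.library_id].add(f.lane)
    # Keep libraries seen on at least two distinct lanes, with lanes sorted.
    return {lib: sorted(ls) for lib, ls in lanes.items() if len(ls) > 1}
```

With two files for a hypothetical library `lib1` on lanes 1 and 2 and one file for `lib2` on lane 1, `multi_lane_libraries` would report only `lib1`, and `group_by_library` would hand both `lib1` files to a single processing run.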