HumanCellAtlas / secondary-analysis

Secondary Analysis Service of the Human Cell Atlas Data Coordination Platform
https://pipelines.data.humancellatlas.org/ui/
BSD 3-Clause "New" or "Revised" License

[spike] Estimate implementation of no-op on receiving bundle updates #606

Closed justincc closed 5 years ago

justincc commented 5 years ago

For the restricted Q2 updates scope, please could you t-shirt the work required to not submit analysis for any assay data metadata updates, where analysis for that assay has been submitted to ingest previously. From pre-RFC.

This supersedes #582 and #583 for this scope. Thanks!

samanehsan commented 5 years ago

@justincc do you mean "t-shirt the work required to not submit analysis for any assay metadata updates"? Data updates would impact analysis, and the epic ticket mentions "Updates which require an analysis pipeline to be run again" as out of scope.

justincc commented 5 years ago

Sorry yes, I meant metadata

justincc commented 5 years ago

So this ticket morphed into telling the wranglers which metadata fields not to update, since updating them would trigger re-analysis, which we can't cope with in Q2 functionality. This restriction should be removed in the Q3 work. @rexwangcc is accountable for this. Here is the draft: HCA DCP Pipelines Execution Service Developer-Facing Guide.

brianraymor commented 5 years ago

This is missing both a Milestone and a Release. Please add them, since this is blocking a Q2->Q3 Roadmap objective. @justincc and @jkaneria - could you comment on what work is actually pending for this Spike in M1? It's a bit ambiguous to me.

justincc commented 5 years ago

For 2019Q2 this work has 2 components:

1) Ignore inconsequential - Mint box do not submit an analysis if they receive an inconsequential update (one which updates metadata that will not affect the pipeline output). This is under active development.

2) Forbid consequential - Mint box inform wranglers of the consequential metadata so that wranglers can avoid submitting these updates in Q2. According to this document from @rexwangcc, it looks like this is just taxon. Wranglers are aware.

I don't know if estimates were done for these. It may be simplest to change the ticket title. This work is being done by mint box and either @morrisonnorman could chase status or @jkaneria could determine its milestone.
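To make the two components above concrete, here is a minimal sketch of how an update could be classified as consequential or inconsequential by diffing the old and new metadata. It is not the Mint box implementation. The function names and the field path `biomaterial_core.ncbi_taxon_id` are assumptions for illustration; the thread only says the consequential metadata "looks like" it is just taxon.

```python
from typing import Any, Dict, Iterator, Set, Tuple

# Hypothetical set of metadata fields whose change would affect pipeline
# output. The exact field path is an assumption, not taken from the thread.
CONSEQUENTIAL_FIELDS: Set[str] = {"biomaterial_core.ncbi_taxon_id"}

# Sentinel so that a missing field compares unequal to any real value.
_MISSING = object()


def flatten(doc: Dict[str, Any], prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, value) pairs for every leaf of a nested dict."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict) and value:
            yield from flatten(value, path)
        else:
            yield path, value


def changed_fields(old: Dict[str, Any], new: Dict[str, Any]) -> Set[str]:
    """Return the dotted paths whose values differ between old and new."""
    old_flat = dict(flatten(old))
    new_flat = dict(flatten(new))
    return {
        path
        for path in old_flat.keys() | new_flat.keys()
        if old_flat.get(path, _MISSING) != new_flat.get(path, _MISSING)
    }


def is_consequential(old: Dict[str, Any], new: Dict[str, Any]) -> bool:
    """True if the update touches any field listed as consequential."""
    return bool(changed_fields(old, new) & CONSEQUENTIAL_FIELDS)


if __name__ == "__main__":
    before = {"biomaterial_core": {"ncbi_taxon_id": [9606], "biomaterial_name": "donor A"}}
    renamed = {"biomaterial_core": {"ncbi_taxon_id": [9606], "biomaterial_name": "donor B"}}
    retaxoned = {"biomaterial_core": {"ncbi_taxon_id": [10090], "biomaterial_name": "donor A"}}
    print(is_consequential(before, renamed))    # name change only
    print(is_consequential(before, retaxoned))  # taxon change
```

Under this sketch, component 1 corresponds to skipping analysis submission when `is_consequential` returns false, and component 2 corresponds to publishing the contents of `CONSEQUENTIAL_FIELDS` to wranglers so they can avoid those updates.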

brianraymor commented 5 years ago

@jkaneria provided an update on Slack: "We thought we had a solution that would work, but it caused duplicate workflows. Tech leads are still exploring possible solutions. Once a single solution is identified, we can break this out into further tickets." Is this exploration time-boxed to Q3 M1, which ends on July 28?

jkaneria commented 5 years ago

Yes! We think we will have a solution, and tickets on what it will take to implement that solution, by July 28. As always, there is a possibility of slippage; if there is, it should be a couple of days at most.

jkaneria commented 5 years ago

Part one of Justin's comment: "1) Ignore inconsequential - Mint box do not submit an analysis if they receive an inconsequential update (one which updates metadata that will not affect the pipeline output). This is under active development."

We thought we had a possible solution for this portion; however, our solution is failing in integration tests due to duplicate notifications. @samanehsan is looking at possible alternate solutions, and that will complete this spike. We believe we are on track to hit 7/28 with the investigation.

samanehsan commented 5 years ago

In the meantime, we can unblock updates to submissions by modifying our subscriptions in production to only receive integration test notifications. That way, we can still promote and test changes to production, and data wranglers can go ahead with fixing their submissions. Would that work for everyone?

jkaneria commented 5 years ago

Updated estimate for completion is end of milestone 2.

jkaneria commented 5 years ago

I would like to close this ticket in order to de-duplicate this and https://github.com/HumanCellAtlas/secondary-analysis/issues/654.