HumanCellAtlas / matrix-service

DCP Expression Matrix Service
MIT License

Generate per-project matrix service output files in Data Browser S3 bucket #351

Closed theathorn closed 5 years ago

theathorn commented 5 years ago

Moved from https://app.zenhub.com/workspaces/dcp-5ac7bcf9465cb172b77760d9/issues/databiosphere/azul/1098

brianraymor commented 5 years ago

Isn't this a duplicate of #331 which has already been addressed?

theathorn commented 5 years ago

There's a requirement to generate the matrix output files in the specified Data Browser S3 bucket with the correct Content-Disposition header, as per https://app.zenhub.com/workspaces/dcp-5ac7bcf9465cb172b77760d9/issues/databiosphere/azul/1098, so that they can be directly downloaded by the user. #331 was closed 3 weeks ago, which appears to pre-date this agreement (made during the Data Browser meeting of 7/15/19).

theathorn commented 5 years ago

You also need to set cache control headers as per https://app.zenhub.com/workspaces/dcp-5ac7bcf9465cb172b77760d9/issues/databiosphere/azul/1146

mckinsel commented 5 years ago

Just some clarification of what this ticket means: we have a plan to automate the creation of project matrices as part of ETL, and that is recorded in #352. After further discussions with the data browser team, we decided it made sense to put the results of that automation directly into an S3 bucket that the browser reads to populate its matrix download buttons. But there is a little more work involved than just copying results to a bucket: there is an expected structure, and some headers need to be set. This issue represents that work.
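For illustration only, here is a rough sketch of what setting those headers on upload might look like. The key layout, the `max-age` value, and the helper name `build_put_object_args` are all assumptions made for this example; the actual structure and header values are whatever the linked Data Browser issues specify. `ContentDisposition` and `CacheControl` are existing parameters of boto3's `put_object`.

```python
from typing import Dict


def build_put_object_args(
    bucket: str,
    project_id: str,
    filename: str,
    max_age_seconds: int = 3600,  # assumed value, not taken from the issue
) -> Dict[str, str]:
    """Build keyword arguments for an S3 put_object call (hypothetical helper)."""
    # Assumed key layout: one prefix per project. The real expected
    # structure is defined by the Data Browser team, not here.
    key = f"{project_id}/{filename}"
    return {
        "Bucket": bucket,
        "Key": key,
        # Makes the browser download the object under a sensible file name.
        "ContentDisposition": f'attachment; filename="{filename}"',
        # Controls how long browsers and proxies may cache the object.
        "CacheControl": f"public, max-age={max_age_seconds}",
    }


# Possible usage with boto3 (not executed here; names are placeholders):
#   import boto3
#   args = build_put_object_args("example-bucket", "abc123", "abc123.loom")
#   with open("abc123.loom", "rb") as fh:
#       boto3.client("s3").put_object(Body=fh, **args)
```

Keeping the header construction in a small pure function would make it easy to check the values without touching S3.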

brianraymor commented 5 years ago

@mckinsel - I think that we should close this issue since we have tracking issues for specific projects requiring manual intervention in Q3. We can then focus on #352 in Q4M1.

brianraymor commented 5 years ago

No objections to closing. Until #352 is addressed, individual issues will be opened for incoming Barcelona data sets for easier tracking by DataOps.