ebi-ait / dcp-ingest-central

Central point of access for the Ingestion Service of the HCA DCP
Apache License 2.0

Additional project field - visualisation portals, to appear both in backoffice and Catalogue #670

Closed gabsie closed 2 years ago

gabsie commented 2 years ago

As a Data contributor/wrangler, I would like to add the link to my project in a supported Analysis Portal (e.g. Cellxgene/SCEA/UCSC browser), so that other scientists can view my data in the visualisation tool

Acceptance Criteria

To move to a separate ticket (and sprint)

MightyAx commented 2 years ago

Just an idea: I believe that if the cellxgene link is added as an accession, it will be displayed in the project catalogue (but perhaps this requires an HCA metadata version change).

gabsie commented 2 years ago

Please refer to 409.

amnonkhen commented 2 years ago

I created a wireframe

MightyAx commented 2 years ago

The recommendation here is to store links to Cellxgene / SCEA / UCSC in the project metadata schema

project.supplementary_links

External link(s) pointing to code, supplementary data files, or analysis files associated with the project which will not be uploaded. example: https://github.com/czbiohub/tabula-muris; http://celltag.org/

The suggested change is to highlight supported URLs in the project catalogue.

@gabs / @ESapenaVentura Please can I have example URLs for each of the following, so that I can determine how we detect them:

  1. Cellxgene
  2. SCEA
  3. UCSC

gabsie commented 2 years ago

Hi, @MightyAx

So, for UCSC browser links:
https://cells.ucsc.edu/?ds=fetal-thymus
https://cells.ucsc.edu/?ds=lifespan-nasal-atlas

CellxGene:
https://cellxgene.cziscience.com/collections/0434a9d4-85fd-4554-b8e3-cf6c582bb2fa
https://cellxgene.cziscience.com/collections/6f6d381a-7701-4781-935c-db10d30de293

SCEA:
https://www.ebi.ac.uk/gxa/sc/experiments/E-CURD-98/results/tsne
https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-11011/results/tsne

ESapenaVentura commented 2 years ago

> SCEA: https://www.ebi.ac.uk/gxa/sc/experiments/E-CURD-98/results/tsne https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-11011/results/tsne

Just adding to this: for the regex, don't expect the "tsne" part at the end. The URL won't always end like that!
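As a rough illustration of that point, here is a minimal Python sketch of a pattern that captures the SCEA experiment accession and ignores whatever follows it. The pattern, the function name `scea_accession`, and the shortened second URL are assumptions made for this example; the thread does not show the regex that was actually used.

```python
import re
from typing import Optional

# Hypothetical pattern: capture the path segment that follows
# /gxa/sc/experiments/ and stop at the next "/", "?" or "#".
# Nothing after the accession (such as /results/tsne) is required.
SCEA_EXPERIMENT = re.compile(r"ebi\.ac\.uk/gxa/sc/experiments/([^/?#]+)")


def scea_accession(url: str) -> Optional[str]:
    """Return the SCEA experiment accession found in `url`, or None."""
    match = SCEA_EXPERIMENT.search(url)
    return match.group(1) if match else None


# A URL from this thread, ending in /results/tsne.
print(scea_accession("https://www.ebi.ac.uk/gxa/sc/experiments/E-CURD-98/results/tsne"))
# An assumed shorter form of the same kind of URL, without the suffix.
print(scea_accession("https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-11011"))
```

Anchoring on the part before the accession, rather than on the end of the URL, means the match does not depend on which results page the link points to.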

MightyAx commented 2 years ago

My current plan is to consider any supplementary link that contains:

  1. cellxgene.cziscience.com/collections as a cellxgene link
  2. ebi.ac.uk/gxa/sc/experiments as a Single Cell Expression Atlas link
  3. cells.ucsc.edu as a UCSC Cell Browser link
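The plan above could be sketched roughly as follows. This is Python for illustration only; the names `PORTAL_MARKERS` and `portal_for` are hypothetical, and the catalogue's real implementation is not shown in this thread.

```python
from typing import Optional

# Substring -> display name, taken from the plan above.
PORTAL_MARKERS = {
    "cellxgene.cziscience.com/collections": "cellxgene",
    "ebi.ac.uk/gxa/sc/experiments": "Single Cell Expression Atlas",
    "cells.ucsc.edu": "UCSC Cell Browser",
}


def portal_for(link: str) -> Optional[str]:
    """Return the portal name for a supplementary link, or None if unsupported."""
    for marker, name in PORTAL_MARKERS.items():
        if marker in link:
            return name
    return None


links = [
    "https://cellxgene.cziscience.com/collections/0434a9d4-85fd-4554-b8e3-cf6c582bb2fa",
    "https://www.ebi.ac.uk/gxa/sc/experiments/E-CURD-98/results/tsne",
    "https://cells.ucsc.edu/?ds=fetal-thymus",
    "https://github.com/czbiohub/tabula-muris",  # ordinary supplementary link
]
for link in links:
    print(portal_for(link), link)
```

A plain substring test keeps the detection simple, at the cost of also matching any unrelated URL that happens to contain one of the markers.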

We can do more regex manipulation to figure out:

But then we should probably be validating the data on input and storing it separately also. If that's in scope for this sprint it's a much bigger task.

Also, any opinions on the use of logos versus the following text? cellxgene / Single Cell Expression Atlas / UCSC Cell Browser. If you would like logos, please provide them.

ipediez commented 2 years ago

@gabsie and @ESapenaVentura to take a look and give feedback

MightyAx commented 2 years ago

Current progress showing filtering and output for SCEA. The code also works for cellxgene and UCSC but no prod data features those in supplementary links.

It seems like we will need some regex after all, to display the accessions.

Screenshot 2022-04-22 at 11.05.55.png

My big question then is how to display cellxgene accessions, since they are massive UUIDs and this cell of the table is already overflowing badly. Perhaps that's where to bring in the logo.

To help with the overflow I'll just use "SCEA" rather than "Single Cell Expression Atlas"

MightyAx commented 2 years ago

Implemented accession regex.

Screenshot 2022-04-22 at 13.11.30.png
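The regex itself is not included in this thread. As a hedged sketch of what accession extraction for the three portals might look like, here is a Python example; the patterns, the `link_label` function, and the "portal: accession" label format are all assumptions for illustration.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical patterns, based only on the example URLs in this thread.
CELLXGENE = re.compile(r"cellxgene\.cziscience\.com/collections/([0-9a-fA-F-]{36})")
SCEA = re.compile(r"ebi\.ac\.uk/gxa/sc/experiments/([^/?#]+)")
UCSC_HOST = "cells.ucsc.edu"


def link_label(url: str) -> Optional[str]:
    """Build a short display label for a supported portal link, else None."""
    match = CELLXGENE.search(url)
    if match:
        return f"cellxgene: {match.group(1)}"
    match = SCEA.search(url)
    if match:
        return f"SCEA: {match.group(1)}"
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == UCSC_HOST or host.endswith("." + UCSC_HOST):
        # The example links carry the dataset name in the "ds" query parameter.
        dataset = parse_qs(parsed.query).get("ds", [None])[0]
        return f"UCSC Cell Browser: {dataset}" if dataset else "UCSC Cell Browser"
    return None


print(link_label("https://cellxgene.cziscience.com/collections/6f6d381a-7701-4781-935c-db10d30de293"))
print(link_label("https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-11011/results/tsne"))
print(link_label("https://cells.ucsc.edu/?ds=lifespan-nasal-atlas"))
```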

The accession cell is getting very full for some records.

MightyAx commented 2 years ago

Added an external-link signifier and started using this formatting everywhere (just ENA so far).

Screenshot 2022-04-22 at 16.23.25.png

prabh-t commented 2 years ago

Alexie - Feature complete but some refactoring remaining.

MightyAx commented 2 years ago

The list-of-links component has been completed and is awaiting re-review.

Screenshot 2022-04-25 at 11.45.46.png

MightyAx commented 2 years ago

Deployment in the EBI Web Dev Environment failed, but I don't know how to follow this up!

MightyAx commented 2 years ago

The test step that is failing (because yarn is not installed) was added by @jacobwindsor's recent cleanup. I am removing it for now.

MightyAx commented 2 years ago

Deployment Successful, ready for user testing: https://wwwdev.ebi.ac.uk/humancellatlas/project-catalogue/

MightyAx commented 2 years ago

Known Issue: TSV generation shows [Object object] instead of accession. Will fix tomorrow.
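This symptom usually appears when a structured value is converted to text without first choosing which field to print. The sketch below shows the general remedy of flattening each accession to a string before writing the row. It is Python for illustration only, and the accession shape (`{"name": ..., "url": ...}`), the column names, and the helper names are assumptions, not the catalogue's actual data model.

```python
import csv
import io


def accession_text(value) -> str:
    """Flatten one accession to plain text for a TSV cell."""
    if isinstance(value, dict):
        # Assumed shape: {"name": "<accession>", "url": "<link>"}.
        return str(value.get("name", ""))
    return str(value)


def to_tsv(rows) -> str:
    """Write (project, accessions) pairs as TSV with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["project", "accessions"])  # hypothetical column names
    for project, accessions in rows:
        cell = ", ".join(accession_text(a) for a in accessions)
        writer.writerow([project, cell])
    return buffer.getvalue()


rows = [
    (
        "Example project",
        [{"name": "E-MTAB-11011",
          "url": "https://www.ebi.ac.uk/gxa/sc/experiments/E-MTAB-11011/results/tsne"}],
    ),
]
print(to_tsv(rows))
```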

ESapenaVentura commented 2 years ago

> Also, any opinions on the use of logos versus the following text? cellxgene / Single Cell Expression Atlas / UCSC Cell Browser. If you would like logos, please provide them.

I think the names are fine for now. Adding the logos would require a lot more work (we would need to contact each of the institutions for their logos, etc.).

> My big question then is how to display cellxgene since they are massive UUIDs and this cell of the table is already massively overflowing, perhaps that's where to bring in the logo.

I think for now we just need to stick with the UUID; cellxgene has no open APIs where we could query a short identifier for the project.

I tried to download the TSV for the project to which I added the SCEA/cellxgene/UCSC Cell Browser accessions as a test, and the resulting TSV had no headers or entries for those accessions. I assume this is the error you mention above, or is it different?

Other than that, I've been testing around and adding a couple of accessions to projects and it works great!

gabsie commented 2 years ago

@gabsie to check as well.

MightyAx commented 2 years ago

Project Catalogue in wwwdev not being deployed to production. Operations ticket to track with EBI Gitlab Team: https://github.com/ebi-ait/hca-ebi-wrangler-central/issues/788

MightyAx commented 2 years ago

Project Catalogue deployment to Prod successful!

gabsie commented 2 years ago

Hey Alexie, this worked great when I tried the different links. You might need to make a small edit to your formula for detecting the Cell Browser links. I have seen in the sheet that their links can sometimes be:

https://gut-cell-atlas.cells.ucsc.edu
https://covid19-influenza-response.cells.ucsc.edu

So maybe you also have to consider links that end in ... cells.ucsc.edu, not only those that start with it (I can see this displaying fine: https://cells.ucsc.edu/?ds=adultPancreas).
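One way to accept both forms is to compare the URL's hostname rather than the start of the string. The following is a minimal Python sketch under that assumption; the function name `is_ucsc_cell_browser` is hypothetical and the fix that was actually applied is not shown in this thread.

```python
from urllib.parse import urlparse

UCSC_HOST = "cells.ucsc.edu"


def is_ucsc_cell_browser(url: str) -> bool:
    """True if the URL's host is cells.ucsc.edu or any subdomain of it."""
    # urlparse only fills in hostname when the URL includes a scheme (https://).
    host = (urlparse(url).hostname or "").lower()
    return host == UCSC_HOST or host.endswith("." + UCSC_HOST)


for url in [
    "https://gut-cell-atlas.cells.ucsc.edu",
    "https://covid19-influenza-response.cells.ucsc.edu",
    "https://cells.ucsc.edu/?ds=adultPancreas",
]:
    print(is_ucsc_cell_browser(url), url)
```

Requiring the leading dot in the suffix check avoids treating a different host whose name merely ends in the same letters as a Cell Browser link.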

I have now entered details for 4 projects; you can see them by filtering by the cellxgene portal.

Gabs

MightyAx commented 2 years ago

Change in review.

MightyAx commented 2 years ago

Ready for review on dev: https://wwwdev.ebi.ac.uk/humancellatlas/project-catalogue/

I can release to prod when you approve.

ESapenaVentura commented 2 years ago

Second component to allow for a bit more links in staging yet - But ready to be released

@gabsie to take a look

gabsie commented 2 years ago

This has been checked and is OK!

MightyAx commented 2 years ago

Deployed to PROD