nci-hcmi-catalog / portal

HCMI Searchable Catalog Portal
https://hcmi-searchable-catalog.nci.nih.gov/
BSD 3-Clause "New" or "Revised" License

Admin Override for MAF input #936

Closed rosibaj closed 3 years ago

rosibaj commented 3 years ago

The following is an explanation of how a model in HCMI has data imported from the GDC:

  1. Starting with the model name in the CMS, for example “HCM-CSHL-0056-C18”, we make a request to the GDC API looking for Cases with the Submitter ID (cases.submitter_id) matching that model name.
  2. Once we have a matching Case for a given model name, we find all Files for that Case which have “Open” Access and the “MAF” Data Format (files.access, files.data_format).
  3. Because of how the GDC GraphQL API works, this gives us two lists: one list of matching Cases (in this example, the list only has one Case, for the one model name we have searched for), and one list of matching Files.
  4. The next step is to iterate through the two lists and combine the data so that, for a given model name, we can iterate through its Files and determine whether there are NGCM MAFs which we can import.
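
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the portal's actual code: it makes no network request and assumes the two lists have already been fetched. The function names, the record shapes (`case_id`, `submitter_id`, `file_id`, `case_ids`) and the exact value casing (`open`, `MAF`) are assumptions; only the field names `cases.submitter_id`, `files.access` and `files.data_format` come from the description above.

```python
from collections import defaultdict


def build_gdc_filters(model_names):
    """Build a GDC-style filter for steps 1 and 2 (value casing is assumed)."""
    return {
        "op": "and",
        "content": [
            {"op": "in", "content": {"field": "cases.submitter_id", "value": list(model_names)}},
            {"op": "in", "content": {"field": "files.access", "value": ["open"]}},
            {"op": "in", "content": {"field": "files.data_format", "value": ["MAF"]}},
        ],
    }


def group_mafs_by_model(model_names, cases, files):
    """Combine the Case list and the File list (steps 3 and 4).

    Returns {model_name: [file_id, ...]}; a model name with no matching
    Case, or no open MAF files, maps to an empty list.
    """
    # Map each case ID back to the model name (submitter ID) it matched.
    case_to_model = {
        case["case_id"]: case["submitter_id"]
        for case in cases
        if case["submitter_id"] in model_names
    }
    mafs = defaultdict(list)
    for f in files:
        is_open_maf = f["access"].lower() == "open" and f["data_format"].upper() == "MAF"
        if not is_open_maf:
            continue
        for case_id in f["case_ids"]:
            model = case_to_model.get(case_id)
            if model is not None:
                mafs[model].append(f["file_id"])
    return {name: mafs.get(name, []) for name in model_names}


# Hypothetical records, shaped as assumed above.
example_cases = [{"case_id": "c1", "submitter_id": "HCM-CSHL-0056-C18"}]
example_files = [
    {"file_id": "f1", "access": "open", "data_format": "MAF", "case_ids": ["c1"]},
    {"file_id": "f2", "access": "controlled", "data_format": "MAF", "case_ids": ["c1"]},
]
example_result = group_mafs_by_model(
    ["HCM-CSHL-0056-C18", "HCM-CSHL-0434-C24-A"], example_cases, example_files
)
```

In this sketch, a model name that matches no Case by exact submitter ID simply ends up with an empty list, so nothing would be imported for it.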

For the model that was pointed out in the comment: in the HCMI Searchable Catalog, HCM-CSHL-0434-C24-A and HCM-CSHL-0434-C24-B are two separate entities with unique identifiers.

To support import of MAFs for this type of case (multiple models with no matching case in GDC)

Zeplin Links:

rosibaj commented 3 years ago

Tested June 23, 2021 10:39

Issues

mistryrn commented 3 years ago

Regarding the above feedback points:

As for the first point, I believe the incorrect URL was already present in the Google Sheet used to bulk import the data: https://docs.google.com/spreadsheets/d/1qGTjLjWBS1epMK_qB-FWk_VyhR5B7kviPrSKm-sFYBE/edit#gid=1253853088

You can see the incorrect case ID present for this model on that sheet.