ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

Fix linking GSE114727_BreastTumorMicroenvironment #1282

Open idazucchi opened 1 month ago

idazucchi commented 1 month ago

Context

see slack message from Hannes

In brief, GSE114727_BreastTumorMicroenvironment is triggering integration test failures for Azul. Reason: the analysis files are linked to sequence files and, partly because of the size of the project, Azul takes a long time to reconstruct the graph.

ingest dcp project

Task

Deadline

as soon as possible because the project will be removed from the portal otherwise

idazucchi commented 1 month ago

I've prepared a script to get the files' uuids and their input cell suspensions' uuids. Using the uuids I can add the cell suspensions to the matrix creation process.

But I can't figure out how to remove the fastq files from the process via the API. There are 406 of them, so I can't do it manually.
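A minimal sketch of the uuid lookup described above, not the actual script. It assumes the ingest API returns HAL-style JSON in which a file exposes a `derivedByProcesses` link and a process exposes an `inputBiomaterials` link, and that cell suspensions can be recognised by a `describedBy` URL ending in `/cell_suspension`. The helper names and the `get_json` parameter are hypothetical.

```python
from typing import Callable, List, Tuple

# Assumed ending of the schema URL for cell suspension documents.
CELL_SUSPENSION_SUFFIX = "/cell_suspension"


def embedded(resource: dict, key: str) -> List[dict]:
    """Return a HAL `_embedded` collection, or an empty list if it is absent."""
    return resource.get("_embedded", {}).get(key, [])


def input_suspension_uuids(
    file_url: str, get_json: Callable[[str], dict]
) -> Tuple[str, List[str]]:
    """Return the file's uuid and the uuids of the cell suspensions that
    were inputs to the process(es) the file was derived by."""
    file_doc = get_json(file_url)
    file_uuid = file_doc["uuid"]["uuid"]
    suspensions: List[str] = []
    # Link names below are assumptions about the API's response layout.
    processes_url = file_doc["_links"]["derivedByProcesses"]["href"]
    for process in embedded(get_json(processes_url), "processes"):
        biomaterials_url = process["_links"]["inputBiomaterials"]["href"]
        for biomaterial in embedded(get_json(biomaterials_url), "biomaterials"):
            described_by = biomaterial.get("content", {}).get("describedBy", "")
            if described_by.endswith(CELL_SUSPENSION_SUFFIX):
                suspensions.append(biomaterial["uuid"]["uuid"])
    return file_uuid, suspensions
```

Here `get_json` is any callable that fetches a URL and returns the parsed JSON, for example a thin wrapper around an authenticated HTTP client. Passing it in keeps the lookup logic testable without network access.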

idazucchi commented 1 month ago

to remove a file from a process you have to use the PUT action on the file's inputToProcesses link

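# `api` is assumed to be an authenticated ingest API client; replace <file_id> with the file's id
input_to_process_link = "https://api.ingest.archive.data.humancellatlas.org/files/<file_id>/inputToProcesses"
# the PUT on this link is what removes the file from the process
response = api.put(input_to_process_link)
response.raise_for_status()  # raise if the API rejected the request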
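With 406 fastq files, the same call would need to run in a loop. The following is a rough sketch under the assumption that `api` behaves as in the snippet above (a `put(url)` method returning a response with `raise_for_status()`). The function name and the choice to collect failures instead of stopping are illustrative, not part of the original script.

```python
from typing import Iterable, List

INGEST_API = "https://api.ingest.archive.data.humancellatlas.org"


def unlink_files_from_processes(api, file_ids: Iterable[str]) -> List[str]:
    """PUT to each file's inputToProcesses link and return the ids that failed.

    `api` is assumed to expose put(url) returning an object with
    raise_for_status(), as in the snippet above.
    """
    failed: List[str] = []
    for file_id in file_ids:
        url = f"{INGEST_API}/files/{file_id}/inputToProcesses"
        try:
            response = api.put(url)
            response.raise_for_status()
        except Exception as error:  # keep going so one bad file does not block the rest
            print(f"could not unlink {file_id}: {error}")
            failed.append(file_id)
    return failed
```

Returning the failed ids makes it possible to retry only those files after the first pass.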

idazucchi commented 1 month ago

export failed because of a race condition - both exporter threads were working on the same file (staging_area.json in the logs below) and each got stuck waiting for the other thread to stop writing to it

index:27/79 - Verifying upload of blob prod/7c75f07c-608d-4c4a-a1b7-b13d11c0ad31/staging_area.json. Waiting for 0.4 seconds...\n","stream":"stderr","time":"2024-08-02T10:16:45.327081114Z"}

{"log":"2024-08-02 10:16:45,215 - TerraExperimentExporter - INFO - submission_uuid:46321950-6b54-4e05-a7c5-d7cef1ad12a9 - export_job_id:66ab64b9f7e68c76e8fe3806 - project_uuid:7c75f07c-608d-4c4a-a1b7-b13d11c0ad31 - process_uuid:1c47090a-a1f8-4293-a257-ad30fb395ce4 - index:26/79 - Verifying upload of blob prod/7c75f07c-608d-4c4a-a1b7-b13d11c0ad31/staging_area.json. Waiting for 0.4 seconds...\n","stream":"stderr","time":"2024-08-02T10:16:45.216093149Z"}

{"log":"2024-08-02 10:16:45,089 - TerraExperimentExporter - INFO - submission_uuid:46321950-6b54-4e05-a7c5-d7cef1ad12a9 - export_job_id:66ab64b9f7e68c76e8fe3806 - project_uuid:7c75f07c-608d-4c4a-a1b7-b13d11c0ad31 - process_uuid:45d8ca4a-ca1d-4d7c-87cd-4cf6413b573b - index:27/79 - Verifying upload of blob prod/7c75f07c-608d-4c4a-a1b7-b13d11c0ad31/staging_area.json. Waiting for 0.2 seconds...\n","stream":"stderr","time":"2024-08-02T10:16:45.089850849Z"}

{"log":"2024-08-02 10:16:44,985 - TerraExperimentExporter - INFO - submission_uuid:46321950-6b54-4e05-a7c5-d7cef1ad12a9 - export_job_id:66ab64b9f7e68c76e8fe3806 - project_uuid:7c75f07c-608d-4c4a-a1b7-b13d11c0ad31 - process_uuid:1c47090a-a1f8-4293-a257-ad30fb395ce4 - index:26/79 - Verifying upload of blob prod/7c75f07c-608d-4c4a-a1b7-b13d11c0ad31/staging_area.json. Waiting for 0.2 seconds...\n","stream":"stderr","time":"2024-08-02T10:16:44.985814481Z"}

{"log":"2024-08-02 10:16:44,825 - TerraExperimentExporter - INFO - submission_uuid:46321950-6b54-4e05-a7c5-d7cef1ad12a9 - export_job_id:66ab64b9f7e68c76e8fe3806 - project_uuid:7c75f07c-608d-4c4a-a1b7-b13d11c0ad31 - process_uuid:1c47090a-a1f8-4293-a257-ad30fb395ce4 - index:26/79 - Writing file: prod/7c75f07c-608d-4c4a-a1b7-b13d11c0ad31/staging_area.json\n","stream":"stderr","time":"2024-08-02T10:16:44.825287534Z"}

{"log":"2024-08-02 10:16:44,937 - TerraExperimentExporter - INFO - submission_uuid:46321950-6b54-4e05-a7c5-d7cef1ad12a9 - export_job_id:66ab64b9f7e68c76e8fe3806 - project_uuid:7c75f07c-608d-4c4a-a1b7-b13d11c0ad31 - process_uuid:45d8ca4a-ca1d-4d7c-87cd-4cf6413b573b - index:27/79 - Writing file: prod/7c75f07c-608d-4c4a-a1b7-b13d11c0ad31/staging_area.json\n","stream":"stderr","time":"2024-08-02T10:16:44.93793913Z"}

exported with a script - see here but note that it requires GCP credentials to write to the staging area
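For illustration only, one common way to avoid two threads colliding on a shared file like staging_area.json is to serialise the write behind a lock and let only the first writer proceed. This is a generic sketch and not how the exporter is implemented. The class, the in-memory `store` standing in for the bucket, and the placeholder path are all hypothetical.

```python
import threading
from typing import Dict, List


class SharedFileWriter:
    """Lets exactly one thread write a given path; later callers skip it."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.store: Dict[str, str] = {}  # stands in for the staging bucket
        self.writers: List[str] = []     # which caller actually wrote

    def write_once(self, path: str, content: str, caller: str) -> bool:
        # Check and write happen under the same lock, so they cannot interleave.
        with self._lock:
            if path in self.store:
                return False
            self.store[path] = content
            self.writers.append(caller)
            return True


def run_demo() -> SharedFileWriter:
    writer = SharedFileWriter()
    path = "prod/<project_uuid>/staging_area.json"  # placeholder path
    threads = [
        threading.Thread(target=writer.write_once, args=(path, "{}", f"index-{i}"))
        for i in (26, 27)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return writer
```

Calling `run_demo()` leaves a single entry in `store` and a single name in `writers`, whichever thread got the lock first.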

to do