Clinical-Genomics / cg

Glue between Clinical Genomics apps

Gens not available for some cases #3314

Closed ElekCMMS closed 1 month ago

ElekCMMS commented 4 months ago

Hi,

For cases 15126 and 15006-quattro, I'm not able to access Gens on the structural variant page. Other users have tried these cases as well without success, while other cases work fine for me.

I get the following message when clicking on Gens: [screenshot]

Thank you!

dnil commented 4 months ago

Thank you for reporting! The case in question got a fairly recent rerun (2024-05-08), but gens still tries to load the 2023-05-05 version. The gens db has not been updated, while the old gens files have been removed. Presumably we have an issue with how we handle these cases on cg upload. Most likely the issue exists for all reruns, but it only surfaces when the old files have been removed, which is not always the case. Do let us know if you are in any kind of hurry with this sample, and we can manually reload the gens files for you even before the general issue is fixed.

I'll forward this to the CG repo to check the procedure for gens re-uploads, but it also ties into https://github.com/Clinical-Genomics/gens/issues/71.

ElekCMMS commented 4 months ago

Hi! There is no hurry with these samples, thank you for the update!

Vince-janv commented 2 months ago

To be refined in separate meeting

@seallard will prepare and distribute info before meeting

seallard commented 1 month ago

Reproducing the bug

Try to load the following view https://gens.scilifelab.se/15006-I-1A?region=9:73944888-137989991&variant=e6a336c39982098563f71d62b19c5f30&genome_build=37&case_id=livingimpala&individual_id=ACC9353A1

The view can be found from here https://scout.scilifelab.se/cust003/15006-quattro/sv/variants/e6a336c39982098563f71d62b19c5f30

Description

The observed bug is due to gens attempting to load files which do not exist - a re-run has been done for the case and the old files have been deleted, but gens has not been updated with the new file paths.

More worryingly, this bug only appeared due to the old files for the previous run having been deleted. Otherwise, the users would have seen the old data without knowledge of it being an older run. How bad is that?

Given that the tool seems to present metadata related to the raw sequencing data (such as sequencing depth), I assume that this is not too critical, since a re-run (to my knowledge) usually does not involve re-sequencing and the underlying raw sequencing data should be the same. If this is not the case, or if the tool actually presents data derived from the analysis run, the bug might be really bad, potentially misleading the person interpreting the data.
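The failure mode described above (database entries pointing at files that have since been deleted) could be detected with a simple consistency check. The following is only a minimal sketch: the function name and the shape of the `registered` mapping are hypothetical and are not taken from the cg or gens code bases.

```python
from pathlib import Path


def find_missing_files(registered: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return, per sample, the registered paths that no longer exist on disk.

    `registered` is a hypothetical mapping of sample id -> file paths as they
    would be stored in a database; it is an assumption made for illustration.
    """
    missing: dict[str, list[str]] = {}
    for sample_id, paths in registered.items():
        # Keep only the paths that cannot be found on the file system.
        gone = [path for path in paths if not Path(path).exists()]
        if gone:
            missing[sample_id] = gone
    return missing
```

Note that such a check only catches the case where the old files are gone. It would not flag a sample whose entry still points at existing files from an older run.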

cg and gens interaction

There is a GensAPI class in cg which invokes Gens CLI commands to "load" sample files into gens. This API is used in the BalsamicUploadAPI, which in turn calls the cg upload gens command and uploads the most recent analysis files per sample in the case.

To summarize: there is a gens web API running on a VM and a gens CLI running on Hasta, which the cg CLI interacts with to upload sample analysis files from housekeeper to gens.
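The "most recent analysis files per sample" selection mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the cg implementation: the `AnalysisFiles` record, its fields, and the example paths are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AnalysisFiles:
    """Hypothetical record of the files one analysis run produced for a sample."""

    sample_id: str
    completed: date
    file_paths: tuple[str, ...]


def latest_per_sample(records: list[AnalysisFiles]) -> dict[str, AnalysisFiles]:
    """Pick, for every sample, the record with the most recent completion date."""
    latest: dict[str, AnalysisFiles] = {}
    for record in records:
        current = latest.get(record.sample_id)
        if current is None or record.completed > current.completed:
            latest[record.sample_id] = record
    return latest


# Placeholder data: two runs for one sample, one run for another.
runs = [
    AnalysisFiles("sample_a", date(2023, 1, 1), ("/old_run/sample_a.bed.gz",)),
    AnalysisFiles("sample_a", date(2024, 1, 1), ("/new_run/sample_a.bed.gz",)),
    AnalysisFiles("sample_b", date(2023, 6, 1), ("/old_run/sample_b.bed.gz",)),
]
selected = latest_per_sample(runs)
```

Selecting the newest files is only half of the picture: the selected paths still have to replace whatever is already registered for the sample, otherwise the older entry keeps being served.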

Potential causes of the bug

Vince-janv commented 1 month ago

Solved by https://github.com/Clinical-Genomics/cg/pull/3389. Running cg upload gens <case_id> (which is invoked in the upload command) now overwrites previous entries in the gens database.

The two cases mentioned in the issue have been re-uploaded
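To illustrate why overwriting matters, here is a minimal sketch that uses a plain dictionary as a stand-in for the gens database. The `load_sample` function and its `overwrite` flag are hypothetical, and the "skip if already present" branch is only one plausible way a stale entry could survive a re-run. The thread does not confirm how the previous behaviour was implemented.

```python
def load_sample(
    db: dict[str, dict[str, str]],
    sample_id: str,
    files: dict[str, str],
    overwrite: bool,
) -> bool:
    """Register files for a sample and return True if the database changed.

    `db` is an in-memory stand-in for a real database (an assumption made
    purely for illustration).
    """
    if sample_id in db and not overwrite:
        # Without overwrite, the existing (possibly stale) entry is kept.
        return False
    db[sample_id] = dict(files)
    return True


db: dict[str, dict[str, str]] = {}
load_sample(db, "sample_a", {"coverage": "/old_run/cov.bed.gz"}, overwrite=False)

# A re-run produces new files. Without overwrite, the old path remains.
changed_without = load_sample(
    db, "sample_a", {"coverage": "/new_run/cov.bed.gz"}, overwrite=False
)
path_without = db["sample_a"]["coverage"]

# With overwrite, the entry is replaced by the new path.
changed_with = load_sample(
    db, "sample_a", {"coverage": "/new_run/cov.bed.gz"}, overwrite=True
)
path_with = db["sample_a"]["coverage"]
```

With overwriting enabled, a re-upload after a re-run always leaves the database pointing at the newest files, so deleting the old files can no longer break the view.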