Closed ElekCMMS closed 1 month ago
Thank you for reporting! The case in question has a fairly recent rerun (2024-05-08) of the 2023-05-05 version that gens tries to load. The gens db has not been updated, but the old gens files have been removed. Presumably we have an issue with how we handle these cases on cg upload. Most likely the issue exists for all reruns, but it's not always the case that the old files have been removed. Do let us know if you are in any kind of hurry with this sample, and we can manually reload the gens files for you even before the general issue is fixed.
I'll forward this to the CG repo to check the procedure for gens re-uploads, but it also ties into https://github.com/Clinical-Genomics/gens/issues/71.
Hi! There is no hurry with these samples, thank you for the update!
@seallard will prepare and distribute info before meeting
Try to load the following view https://gens.scilifelab.se/15006-I-1A?region=9:73944888-137989991&variant=e6a336c39982098563f71d62b19c5f30&genome_build=37&case_id=livingimpala&individual_id=ACC9353A1
The view can be found from here https://scout.scilifelab.se/cust003/15006-quattro/sv/variants/e6a336c39982098563f71d62b19c5f30
The observed bug is due to gens attempting to load files which do not exist - a re-run has been done for the case and the old files have been deleted, but gens has not been updated with the new file paths.
More worryingly, this bug only appeared due to the old files for the previous run having been deleted. Otherwise, the users would have seen the old data without knowledge of it being an older run. How bad is that?
Given that the tool seems to present metadata related to the raw sequencing data (such as sequencing depth), I assume that this is not too critical, since a re-run (to my knowledge) usually does not involve re-sequencing and the underlying raw sequencing data should be the same. If this is not the case, or if the tool actually presents data derived from the analysis run, the bug might be really bad - potentially misleading the person interpreting the data.
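The failure mode described above (database entries pointing at files that have since been deleted) could be detected with a plain existence check. The following is only a sketch: SampleRecord and find_stale_records are hypothetical names, and the fields are assumptions about what a gens sample entry holds, not the actual gens schema.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical stand-in for one sample entry in the gens database."""

    sample_id: str
    case_id: str
    baf_file: Path
    coverage_file: Path


def find_stale_records(records):
    """Return (record, missing_paths) pairs for records whose files are gone."""
    stale = []
    for record in records:
        # Collect every registered file that no longer exists on disk.
        missing = [
            path
            for path in (record.baf_file, record.coverage_file)
            if not path.is_file()
        ]
        if missing:
            stale.append((record, missing))
    return stale
```

Note that a check like this only catches the loud variant of the problem. If the old files are still on disk, the entry looks healthy even though it points at a previous run.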
There is a GensAPI class in cg which invokes Gens CLI commands to "load" sample files into gens. This API is used in the BalsamicUploadAPI, which in turn calls the cg upload gens command and uploads the most recent analysis files per sample in the case.
To summarize: there is a gens web API running on a VM and a gens CLI running on Hasta, which the cg CLI interacts with to upload sample analysis files from housekeeper to gens.
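To make the flow above concrete, here is a rough sketch of how a thin wrapper might assemble and run a gens load call for one sample. It is not the actual GensAPI code: the function names are hypothetical, and the subcommand and flag names ("load sample", --sample-id, --case-id, --genome-build, --baf, --coverage) are assumptions that should be checked against the gens CLI help before use.

```python
import shlex
import subprocess
from pathlib import Path


def build_gens_load_command(binary, sample_id, case_id, genome_build,
                            baf_path, coverage_path):
    """Assemble the argument list for loading one sample into gens.

    The subcommand and flag names below are assumptions for illustration.
    """
    return [
        binary, "load", "sample",
        "--sample-id", sample_id,
        "--case-id", case_id,
        "--genome-build", genome_build,
        "--baf", str(baf_path),
        "--coverage", str(coverage_path),
    ]


def run_command(command, dry_run=False):
    """Run the command, or only return its shell form when dry_run is set."""
    printable = shlex.join(command)
    if dry_run:
        # Nothing is executed; the caller can log or inspect the string.
        return printable
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(command, check=True)
    return printable
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with file paths, and the dry-run switch makes it possible to inspect what would be uploaded without touching the gens database.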
cg upload gens command: https://github.com/Clinical-Genomics/cg/blob/14c57a4004a6c204ca7907f79ba6c555df1dd934/cg/cli/upload/gens.py#L26
load command: https://github.com/Clinical-Genomics/gens/blob/7110cf483296c4dda35d356becf7805c654e9fda/gens/db/samples.py#L33
load command
Solved by https://github.com/Clinical-Genomics/cg/pull/3389. Running cg upload gens <case_id> (which is invoked in the upload command) now overwrites previous entries in the gens database.
The two cases mentioned in the issue have been re-uploaded.
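The overwrite behaviour described in the fix can be illustrated with a small in-memory model. This is a sketch of upsert semantics only, not the code from the linked pull request: the class name is hypothetical, the real database is not modeled, and the choice of (sample_id, case_id, genome_build) as the key is an assumption.

```python
class InMemorySampleStore:
    """Toy stand-in for the gens sample collection (hypothetical)."""

    def __init__(self):
        self._samples = {}

    def load(self, sample_id, case_id, genome_build, baf_file, coverage_file):
        """Insert the sample, replacing any earlier entry with the same key.

        Returns True when an existing entry was overwritten.
        """
        # Assumed identity of a sample entry; the real key may differ.
        key = (sample_id, case_id, genome_build)
        replaced = key in self._samples
        self._samples[key] = {
            "baf_file": baf_file,
            "coverage_file": coverage_file,
        }
        return replaced

    def get(self, sample_id, case_id, genome_build):
        """Return the stored file paths, or None if the sample is unknown."""
        return self._samples.get((sample_id, case_id, genome_build))

    def __len__(self):
        return len(self._samples)
```

With this behaviour a re-run replaces the stale paths instead of being rejected as a duplicate, so the entry always points at the most recent upload.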
Hi,
For cases 15126 and 15006-quattro I'm not able to access GENS from the structural variant page. Other users have tried with these cases as well without success, and other cases work fine for me.
I get the following message when clicking on Gens:
Thank you!