Closed (sooheon closed this 5 months ago)
The PANOPLY v1_4 release, which has now been made public, contains a number of Startup-Notebook patches. If the same issue still occurs in a v1_4 workspace, we can troubleshoot further with the logfile. From the terminal, the logfile can be copied from the notebook VM to the workspace Data File folder with the following command:
gsutil cp $WORKSPACE_NAME/edit/runspace/logfile_mn.txt $WORKSPACE_BUCKET
Also, the shorter runtime of run_panda() is normal: it should take only 5-10 minutes to complete successfully. The function was heavily re-optimized after v1_2, cutting the runtime from a few hours to a few minutes, but the tutorial documentation still reflects v1_0.
I've now tried with the published v1_4, and sample_set is populated (though not the multiple tables indicated in the tutorial; I assume this is due to the version difference?)
However, moving on to the workflow, all 5 tasks in panoply_main failed.
Execution log: workflow.151a5dfa-f5b2-478a-bf0f-be053843e850.log
The majority of these errors look to be from parameter mismatches. If this is occurring with the tutorial dataset (tutorial-brca-input.zip), it is likely that the master-parameters_tutorial.yaml in that zip file has not yet been updated to reflect v1_4. We are working on regenerating that tutorial zip. In the meantime, excluding master-parameters_tutorial.yaml from the zip file (or, perhaps more simply, declaring master-parameters_tutorial.yaml as some other dummy file category with 0 when running panda_input() in the notebook) will cause the notebook to generate a master-parameters.yaml file with the appropriate defaults for PANOPLY v1.4.
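If it helps to see which parameters differ between an older parameter file and a freshly generated one, here is a rough Python sketch of a key-level comparison. It is not part of PANOPLY. The function names and the parameter names in the example are hypothetical, and the two files are represented as plain nested dicts (in practice you would first load each YAML file with a parser such as PyYAML).

```python
def flatten_keys(params, prefix=""):
    """Return the set of dotted key paths found in a nested dict."""
    keys = set()
    for name, value in params.items():
        path = f"{prefix}.{name}" if prefix else str(name)
        if isinstance(value, dict) and value:
            # Recurse into non-empty sub-sections.
            keys |= flatten_keys(value, path)
        else:
            keys.add(path)
    return keys


def compare_parameters(old, new):
    """List key paths that appear in only one of the two parameter sets."""
    old_keys = flatten_keys(old)
    new_keys = flatten_keys(new)
    return {
        "only_in_old": sorted(old_keys - new_keys),
        "only_in_new": sorted(new_keys - old_keys),
    }


# Hypothetical parameter sets; these are NOT real PANOPLY parameter names.
old_params = {"global": {"label": "x"}, "cluster": {"k": 3}}
new_params = {"global": {"label": "x", "seed": 1}, "clustering": {"k": 3}}

report = compare_parameters(old_params, new_params)
for side, paths in report.items():
    print(side, paths)
```

Keys listed under only_in_old are candidates for parameters that a newer pipeline version no longer recognizes, and keys under only_in_new are ones the older file is missing. This only compares key names, not values or types.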
Update: The tutorial and tutorial-brca-input.zip have been updated to reflect PANOPLY v1.4 (2024-04-05).
I followed the tutorial at https://github.com/broadinstitute/PANOPLY/wiki/PANOPLY-Tutorial step by step, the only change being to use the PANOPLY_Production_Pipelines_v1_3 workspace and broadcptac/panda:1_3 as the VM image. At the end of the run, the Data tab looks like the following: only the sample_set table is generated, and no data is populated. It does not take a few hours, as the tutorial notes, but finishes within a minute or two.