galaxyproject / tools-iuc

Tool Shed repositories maintained by the Intergalactic Utilities Commission
https://galaxyproject.org/iuc
MIT License

Clair3 fails because of missing output file #4987

Open bernt-matthias opened 1 year ago

bernt-matthias commented 1 year ago

In tests 2 and 3 the file phased_bam.bam is not created, and Galaxy then fails during metadata generation:

setting metadata externally failed for HistoryDatasetAssociation 15: Traceback (most recent call last):
  File "/tmp/tmp1mff07rd/galaxy-dev/lib/galaxy/metadata/set_metadata.py", line 425, in set_metadata_portable
    set_meta(dataset, file_dict)
  File "/tmp/tmp1mff07rd/galaxy-dev/lib/galaxy/metadata/set_metadata.py", line 180, in set_meta
    set_meta_with_tool_provided(
  File "/tmp/tmp1mff07rd/galaxy-dev/lib/galaxy/metadata/set_metadata.py", line 118, in set_meta_with_tool_provided
    dataset_instance.datatype.set_meta(dataset_instance, **set_meta_kwds)
  File "/tmp/tmp1mff07rd/galaxy-dev/lib/galaxy/datatypes/binary.py", line 746, in set_meta
    pysam.index(dataset.file_name, index_file.file_name)
  File "/home/berntm/.planemo/gx_venv_3/lib/python3.9/site-packages/pysam/utils.py", line 69, in __call__
    raise SamtoolsError(
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=samtools index: "/tmp/tmp1mff07rd/job_working_directory/000/9/outputs/galaxy_dataset_88447f80-d7e1-4ecd-ac6d-91c3586faa20.dat" is in a format that cannot be usefully indexed\n'

I was wondering whether --enable_phasing needs to be enabled and --no_phasing_for_fa disabled, but a first experiment with that did not help.

Also, the log complains about chromosome/sequence names, which seem to be demo20 in the BAM and chr1 in the BED.
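A quick way to confirm a name mismatch like this is to compare the contig names in the BED file against the ones the BAM header defines. The following is only a minimal sketch: the helper names are hypothetical, and the BAM contig names are passed in as a plain list (with pysam they could be read from `AlignmentFile.references`, which is not done here to keep the example dependency-free).

```python
def bed_contigs(bed_path):
    """Return the set of contig names found in column 1 of a BED file."""
    names = set()
    with open(bed_path) as handle:
        for line in handle:
            fields = line.split()
            # Skip blank lines, comments, and optional track/browser lines.
            if not fields or fields[0].startswith("#"):
                continue
            if fields[0] in ("track", "browser"):
                continue
            names.add(fields[0])
    return names


def contigs_missing_from_bam(bed_path, bam_contigs):
    """Return the BED contig names that are absent from the BAM header."""
    return sorted(bed_contigs(bed_path) - set(bam_contigs))
```

For the test data described above, calling `contigs_missing_from_bam` with a BED that only lists chr1 and a BAM contig list of `["demo20"]` would report chr1 as missing.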

Any chance that you could have a look, @gallardoalba @mira-miracoli @pvanheus @wm75?

gallardoalba commented 1 year ago

I'll have a look at it.

wm75 commented 1 year ago

Interesting. I just checked the test outputs from #4861, and they contain this warning:

2022-10-28 08:35:54,994 WARNI [galaxy.tool_util.verify] Converting local (test-data) BAM to SAM failed: 'samtools returned with error 1: stdout=None, stderr=[main_samview] fail to read the header from "/tmp/tmp296s667aphased_bam_1.bam".\n'. Will compare BAM files

But the test succeeds after that.

wm75 commented 1 year ago

Haha, so the phased_bam_1.bam test data is an empty file :)

wm75 commented 1 year ago

@gallardoalba you'd have to make sure that the phased BAM contains at least a header. Perhaps you can copy over the one from the input file if and only if the output file is empty?
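A rough sketch of that idea, not the actual tool wrapper code: check whether the output is missing or zero bytes, and only then write a header-only BAM taken from the input. The function names are made up for illustration, and the sketch assumes samtools is on the PATH and that `samtools view -H -b` is used to emit just the header in BAM format.

```python
import os
import subprocess


def is_missing_or_empty(path):
    """True if the file does not exist or has zero bytes."""
    return not os.path.exists(path) or os.path.getsize(path) == 0


def header_only_command(input_bam, output_bam):
    """Build a samtools call that writes only the input's header as BAM."""
    # -H keeps the header only, -b selects BAM output, -o names the output.
    return ["samtools", "view", "-H", "-b", "-o", output_bam, input_bam]


def ensure_bam_has_header(input_bam, output_bam, run=subprocess.run):
    """Copy the input header into output_bam if and only if it is empty.

    Returns True if the header was copied, False if the output was left
    untouched. The `run` parameter exists only so the call can be stubbed.
    """
    if not is_missing_or_empty(output_bam):
        return False
    run(header_only_command(input_bam, output_bam), check=True)
    return True
```

A non-empty output is never overwritten, so real phased output would pass through unchanged.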

gallardoalba commented 1 year ago

Perfect, thanks @wm75!

pvanheus commented 3 months ago

So it turns out that clair3 doesn't produce a phased_bam output but rather phased_vcf. This is addressed in https://github.com/galaxyproject/tools-iuc/pull/6195