sanger-tol / genomenote

This Nextflow DSL2 pipeline takes aligned HiC reads, creates contact maps and a table of statistics.
https://pipelines.tol.sanger.ac.uk/genomenote
MIT License

meta.id confusion #79

Open muffato opened 10 months ago

muffato commented 10 months ago

Description of the bug

On the public_dev branch, the input FASTA file is named GCA_946965045.1.fasta.gz and the Hi-C CRAM file GCA_946965045.1.unmasked.hic.uoEpiScrs1.subsampled.cram, but the assembly parameter is set to GCA_946965045.2. After a run of the test profile:

1) I get these two files in the genome_note/ directory:

2) GCA_946965045.1.csv contains:

Accession,GCA_946965045.1

3) Many more intermediate files are also named GCA_946965045.1.*, which suggests that the pipeline is confused about what meta.id should be.

The input file names can differ from the accession number, but I'd expect the outputs of the pipeline to be named consistently.
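To make the inconsistency easier to spot after a run, a small check along these lines could list the output files whose names start with an accession other than the expected one. This is only a sketch and not part of the pipeline: the function name find_mismatched_outputs, the regular expression, and the idea of walking the whole output directory are assumptions made for illustration.

```python
import re
from pathlib import Path

# Matches an INSDC assembly accession (e.g. GCA_946965045.1) at the start of a file name.
ACCESSION_RE = re.compile(r"^(GC[AF]_\d{9}\.\d+)")


def find_mismatched_outputs(outdir, expected_accession):
    """Return sorted relative paths of files named after a different accession.

    Hypothetical helper: `outdir` is the pipeline output directory and
    `expected_accession` is the value given to the assembly parameter.
    Files whose names do not start with an accession are ignored.
    """
    root = Path(outdir)
    mismatched = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        match = ACCESSION_RE.match(path.name)
        if match and match.group(1) != expected_accession:
            mismatched.append(path.relative_to(root).as_posix())
    return sorted(mismatched)
```

Calling find_mismatched_outputs on the output directory of the test run with "GCA_946965045.2" as the expected accession would then report every file still named after GCA_946965045.1; the location of the output directory depends on how the run was configured.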

Command used and terminal output

nextflow run sanger-tol/genomenote/ -profile test,singularity -r public_dev

Relevant files

No response

System information

Nextflow 23.04.1-5866 from our central installation