rwdavies / QUILT

GNU General Public License v3.0

question about CRAM input format #21

Open 2584589300 opened 1 year ago

2584589300 commented 1 year ago

Hi! When I try to use CRAM files as input, I get the following error:

Error in get_bam_name_and_maybe_convert_cram(iBam, bam_files, cram_files,  :
  Both bam_files and cram_files are empty
Calls: QUILT ... loadBamAndConvert -> get_bam_name_and_maybe_convert_cram
Execution halted

If I convert the CRAM files to BAM files first, it works. I notice that in https://github.com/rwdavies/QUILT/blob/master/QUILT/R/functions.R, in this call:

loadBamAndConvert(
    iBam = iSample,
    L = L,
    pos = pos,
    nSNPs = nSNPs,
    bam_files = bam_files,
    iSizeUpperLimit = iSizeUpperLimit,
    bqFilter = bqFilter,
    chr = chr,
    N = length(sampleNames),
    downsampleToCov = downsampleToCov,
    sampleNames = sampleNames,
    inputdir = tempdir,
    regionName = regionName,
    tempdir = tempdir,
    chrStart = chrStart,
    chrEnd = chrEnd,
    chrLength = NA,
    save_sampleReadsInfo = TRUE,
    use_bx_tag = use_bx_tag,
    bxTagUpperLimit = bxTagUpperLimit,
    default_sample_no_read_behaviour = "return_null"
)

there is no "cram_files = cram_files" argument. Is that a mistake? Thanks!
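
Until a fixed version is available, the CRAM-to-BAM workaround mentioned above can be scripted. The following is only a sketch under assumptions: the function name `cramlist_to_bamlist` and the `SAMTOOLS` override variable are made up for illustration, the cramlist is assumed to hold one CRAM path per line, and `samtools` is assumed to be installed and on the PATH.

```shell
#!/usr/bin/env bash
# Sketch (not part of QUILT): convert every CRAM listed in a cramlist to an
# indexed BAM and write a matching bamlist. Names and layout are assumptions.

# Hypothetical override so a full path or wrapper can replace "samtools".
SAMTOOLS="${SAMTOOLS:-samtools}"

# Usage: cramlist_to_bamlist <cramlist> <reference.fasta> <outdir> <bamlist>
cramlist_to_bamlist() {
    local cramlist="$1"
    local reference="$2"
    local outdir="$3"
    local bamlist="$4"
    local cram bam

    mkdir -p "$outdir" || return 1
    : > "$bamlist" || return 1          # start from an empty bamlist

    # Read one CRAM path per line; also handle a missing final newline.
    while IFS= read -r cram || [ -n "$cram" ]; do
        [ -z "$cram" ] && continue      # skip blank lines
        bam="$outdir/$(basename "${cram%.cram}").bam"
        # -b: write BAM; -T: reference FASTA used to decode the CRAM
        "$SAMTOOLS" view -b -T "$reference" -o "$bam" "$cram" || return 1
        "$SAMTOOLS" index "$bam" || return 1
        printf '%s\n' "$bam" >> "$bamlist"
    done < "$cramlist"
}
```

For example, `cramlist_to_bamlist /data/cramlist.txt Homo_sapiens_assembly38.fasta /data/bams /data/bamlist.txt` (output paths are placeholders) would produce a list that can be given to QUILT through `bamlist` in place of `cramlist`. Note that CRAMs sharing a base name would overwrite each other in the output directory.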

rwdavies commented 1 year ago

Thanks, it looks like I did indeed forget to pass the arguments through.

atrigila commented 10 months ago

Hi @rwdavies! I would like to use QUILT in Nextflow. I have built an nf-core module that I would like to share with the community. However, I have noticed that it fails when I test it with CRAM inputs. As per Nextflow's guidelines, I am currently using the QUILT version in Biocontainers. Is it possible that this version does not have the corrected code? I have also tested it outside Nextflow and I get the same error:

Code used to run Docker interactively:

docker run -it -v /crams:/data quay.io/biocontainers/r-quilt:1.0.4--r43h06b5641_2 /bin/bash

Code used to run QUILT:

R -e 'library("QUILT");
QUILT(
    chr = "chr20",
    cramlist = "/data/cramlist.txt",
    reference_haplotype_file = "/data/ALL.chr20_GRCh38.genotypes.20170504.chr20.2000001.2100000.noNA12878.hap.gz",
    nGen = 100,
    regionStart = 2000001,
    regionEnd = 2100000,
    buffer = 1000,
    outputdir = "quilt_output",
    reference_legend_file = "/data/ALL.chr20_GRCh38.genotypes.20170504.chr20.2000001.2100000.noNA12878.legend.gz",
    reference = "Homo_sapiens_assembly38.fasta"
)'

Error:

[2023-08-25 17:37:23] Get CRAM sample names
[2023-08-25 17:37:23] Done getting CRAM sample names
[2023-08-25 17:37:23] Warning, there are repeat sample names
[2023-08-25 17:37:23] There are 3227 SNPs in this region
[2023-08-25 17:37:24] Imputing sample: 1
Error in get_bam_name_and_maybe_convert_cram(iBam, bam_files, cram_files,  : 
  Both bam_files and cram_files are empty
Calls: QUILT ... loadBamAndConvert -> get_bam_name_and_maybe_convert_cram
Execution halted

Thank you!

rwdavies commented 9 months ago

Hey, thanks, and sorry for the slow reply. I'm just checking this now.

Looks like it is passing now; I just need to push a new version and add it to Bioconductor. I'll work on that over the next few days.

atrigila commented 9 months ago

Thank you! :)