Closed moutsian closed 2 days ago
The installation of htslib is a bit hassle, so we don't have it directly in the package. But, after cloning T1K package, you can manually download the htslib-1.15.1 to the folder. Then you can run "htslib=1 make". Then, T1K can work with CRAM file.
If you want to use another htslib version or change the htslib path, you can modify the Makefile and the header of the "alignments.hpp".
Thank you for your reply. Would it then run with the same command as for a bam file, e.g. run-t1k --preset hla-wgs \ -c /path/to/hla_dna_coord.fa -f /path/to/hla_dna_seq.fa -b /path/to/genome.cram --od /path/to/output \ --alleleDigitUnits 2 --alleleDelimiter ":" or does the bam-extractor function need to be modified?
Cheers, Loukas
Yes, the command would be the same as running with a bam file.
Does it require any particulr command? I don't see any change if only downloading this package and run the command.
@alicekiki It does not require particular command options. All you need to do is to compile T1K with htslib as instructed above.
Thank you! It worked out.
@mourisl, could you please help me out? I am encountering some errors while using a .cram file as an import. The .cram file I am using is from the 1000 Genomes Project. The .bam files are too large, and if possible, I would like to use the .cram files to run t1k.
Here is the code I ran: ./run-t1k -t 20 --preset hla-wgs -c ./t1k_resources/hlaidx/hlaidx_dna_coord.fa -f ./t1k_resources/hlaidx/hlaidx_dna_seq.fa -b path/to/cram/file/HG00138.final.cram -o ./../t1k_resources/t1k_output/NA19625
I encountered the following errors: [bam_header_read] EOF marker is absent. The input is probably truncated. [bam_header_read] invalid BAM binary header (this is not a BAM file). Can not open path/to/cram/file/HG00138.final.cram
I followed the instructions to install htslib and then ran htslib=1 make, which I think worked successfully (see attached screenshot).
I think all necessary files were created after running "htslib=1" make.
I have a question: doesn't the .cram file only store information that is different from that of the .fa file used during the alignment, resulting in a smaller file size compared to the .bam file? Wouldn't this mean that we need to use the .fa file (used during the alignment) somewhere in the run-t1k process?
@alicekiki, how did you manage to get this working?
Thanks, and I look forward to your reply.
@Bondada20 https://github.com/mourisl/T1K/blob/master/Makefile, i first modified the ninth row to 'htslib=1', and then run install. Try and see if there is a difference.
@alicekiki by "run install" do you mean to run "make" after modifying the ninth row to "htslip=1" or..? Also, the .cram file you are using, is it publicly available so I can test it?
Also, if you have time, can you test any of the .cram files from here "https://www.ebi.ac.uk/ena/browser/view/PRJEB31736" and see if it works?
@Bondada20 Modify Between step1 and step2. Dont have publicly available cram file for the moment.
Have you modified the header in the alignment.h file to match your htslib version?
I think for the CRAM file, htslib will internal try to download the reference genome and it will be cached for future use. If you already have the reference genome, maybe the tutorial in the end of https://www.htslib.org/workflow/cram.html is useful.
After some trial and error, I finally got it to work. See below for anyone having issues setting it up. Note: I am working on a server with restrictions on what I can do.
Thank you for sharing the steps. There is no need to explicitly edit line 9 of the Makefile. When running "make", run it like "htslib=1 make" should set the value of htslib to 1.
Hi,
This is a question rather than an issue. I was wondering if there is current support for cram files as input (instead of bams) or whether this is to come in a subsequent version?
Cheers, Loukas