pachterlab / kb_python

A wrapper for the kallisto | bustools workflow for single-cell RNA-seq pre-processing
https://www.kallistobus.tools/
BSD 2-Clause "Simplified" License

ERROR kallisto: unrecognized option `--kmer' #96

Closed khushboojindal closed 3 years ago

khushboojindal commented 3 years ago

When I run the following command in the terminal on a Mac:

kb count /Users/khush/index.idx_cdna,/Users/khush/index.idx_intron.0,/Users/khus/index.idx_intron.1,/Users/khush/index.idx_intron.2 -g /Users/khush/t2g.txt -x 10xv3 --workflow lamanno --loom -c1 /Users/khush/cdna_t2c.txt -c2 /Users/khush/intron_t2c.txt /Users/khush/Downloads/R1.fastq.gz /Users/khush/Downloads/R2.fastq.gz /Users/khush/Downloads/I1.fastq.gz

It shows the error below:

ERROR kallisto 0.46.2
Generates BUS files for single-cell sequencing

Usage: kallisto bus [arguments] FASTQ-files

Required arguments:
-i, --index=STRING        Filename for the kallisto index to be used for pseudoalignment
-o, --output-dir=STRING   Directory to write output to
-x, --technology=STRING   Single-cell technology used

Optional arguments:
-l, --list                List all single-cell technologies supported
-t, --threads=INT         Number of threads to use (default: 1)
-b, --bam                 Input file is a BAM file
-n, --num                 Output number of read in flag column (incompatible with --bam)
    --verbose             Print out progress information every 1M proccessed reads
kallisto: unrecognized option `--kmer'

Error: Number of files (3) does not match number of input files required by technology 10XV3 (2)
[2021-02-10 10:47:49,967] ERROR An exception occurred
Traceback (most recent call last):
  File "/Users/khush/miniconda/lib/python3.8/site-packages/kb_python/main.py", line 846, in main
    COMMAND_TO_FUNCTION[args.command](parser, args, temp_dir=temp_dir)
  File "/Users/khush/miniconda/lib/python3.8/site-packages/kb_python/main.py", line 206, in parse_count
    count_velocity(
  File "/Users/khush/miniconda/lib/python3.8/site-packages/kb_python/count.py", line 1497, in count_velocity
    bus_result = kallisto_bus_split(
  File "/Users/khush/miniconda/lib/python3.8/site-packages/kb_python/count.py", line 193, in kallisto_bus_split
    kallisto_bus(
  File "/Users/khush/miniconda/lib/python3.8/site-packages/kb_python/validate.py", line 112, in inner
    results = func(*args, **kwargs)
  File "/Users/khush/miniconda/lib/python3.8/site-packages/kb_python/count.py", line 149, in kallisto_bus
    run_executable(command)
  File "/Users/khush/miniconda/lib/python3.8/site-packages/kb_python/dry/__init__.py", line 24, in inner
    return func(*args, **kwargs)
  File "/Users/khush/miniconda/lib/python3.8/site-packages/kb_python/utils.py", line 233, in run_executable
    raise sp.CalledProcessError(p.returncode, ' '.join(command))
subprocess.CalledProcessError: Command '/Users/khushminiconda/lib/python3.8/site-packages/kb_python/bins/darwin/kallisto/kallisto bus -i /Users/khush/index.idx_cdna -o ./tmp/bus_part0 -x 10xv3 -t 8 --num --kmer /Users/khush/Downloads/R1.fastq.gz /Users/khush/Downloads/R2.fastq.gz /Users/khush/Downloads/I1.fastq.gz' returned non-zero exit status 1.

Lioscro commented 3 years ago

Hi @khushboojindal, I notice that you are providing three FASTQ files as input, while the 10xv3 technology requires two: the first contains the barcode and UMI sequences, and the second contains the biological cDNA reads.

Based on the filenames of the three FASTQs, I suspect you can just provide the R1.fastq.gz and R2.fastq.gz files (and omit the I1.fastq.gz file).
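As a rough sketch of that suggestion, the index reads can be filtered out of a file list before it is handed to kb count. `drop_index_reads` is a hypothetical helper, not part of kb, and it assumes index reads can be recognised by an `I1` or `I2` token in the file name, which is only a naming convention.

```shell
#!/bin/sh
# Hypothetical helper (not part of kb): print every argument except
# Illumina index-read FASTQs.
# Assumption: index reads are named like I1.fastq.gz, sample_I1.fastq.gz
# or sample_S1_L001_I1_001.fastq.gz.
drop_index_reads() {
  for f in "$@"; do
    case "$(basename "$f")" in
      I[12].fastq.gz|*_I[12].fastq.gz|*_I[12]_*.fastq.gz) ;;  # skip index reads
      *) printf '%s\n' "$f" ;;
    esac
  done
}

# With the three file names from this thread, only R1 and R2 are printed.
drop_index_reads R1.fastq.gz R2.fastq.gz I1.fastq.gz
```

The two remaining paths are the ones to pass as the positional FASTQ arguments of kb count.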

As for the --kmer option, could you let me know what version of kb you are using? You can check this by just running kb without any arguments.

khushboojindal commented 3 years ago

Hi @Lioscro,

Thanks for your response.

R1.fastq.gz, R2.fastq.gz, and L1.fastq.gz are from the same sample but were loaded in different lanes on the 10x. Is there a way to load all three? I'm not sure whether we can skip L1 here.

Kb version:

(base) khushbl@BK-MAC ~ % kb info
kb_python 0.25.1
kallisto: 0.46.2
bustools: 0.40.0

Lioscro commented 3 years ago

Does the name of your third FASTQ start with an I or an L? I ask because Illumina sequencers typically output a third FASTQ, the index read, whose name contains an I instead of an R.

Could you post the first few reads from each file? You can do so with the following command.

gunzip -c R1.fastq.gz | head -n 12

and the analogous commands for the other two files.
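To avoid typing the command three times, the same inspection can be wrapped in a small loop. This is only a sketch: `peek_fastq` is a hypothetical helper name, and it assumes the files are gzip-compressed FASTQs with four lines per record.

```shell
#!/bin/sh
# Hypothetical helper: print the first N records (4 lines each) of every
# gzipped FASTQ file given after the record count.
peek_fastq() {
  n_records=$1
  shift
  for f in "$@"; do
    printf '== %s ==\n' "$f"
    gunzip -c "$f" | head -n $((n_records * 4))
  done
}

# Usage with the file names from this thread (3 records = 12 lines each):
# peek_fastq 3 R1.fastq.gz R2.fastq.gz I1.fastq.gz
```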

khushboojindal commented 3 years ago

Thanks for your prompt response!

Yes, the third file starts with I.

R1.fastq.gz

@NB502117:158:HJ337BGX7:1:11101:2571:1040 1:N:0:GTTCCTCA
AGGTCNGCACGCTTTCCGCCAGATGT
+
AAAAA#EEAEEEEAEEEEEEEEE/AA
@NB502117:158:HJ337BGX7:1:11101:14979:1040 1:N:0:GTTCCTCA
CCATTNGTCACATACGGTCTACTTCG
+
AAAAA#EEEEEEEEEEEEEEEEEEEE
@NB502117:158:HJ337BGX7:1:11101:7745:1040 1:N:0:GTTCCTCA
AAGGTNCAGTGGTAATCCCGTAGACG
+
AAAAA#EEEEEEEEEEEEEEEEEEE<

R2.fastq.gz

@NB502117:158:HJ337BGX7:2:11101:4877:1040 2:N:0:GTTCCTCA
CTACACACCTTATCCCCATACNAGTTATTATCGAAACAATCANCCTACTNNTTCANNNNNTAGCCCTNGNCGTACNCCTAACNGCTAACATTACTGAA
+
AAAAA///EEEA//AEEEEEE#AEA/EEEEE/EAE/A/6/EE#//EE/E##EE/E#####EEEEEEE#/#//EE<#</E<EE#/EE/E<AEEEAE//A
@NB502117:158:HJ337BGX7:2:11101:17921:1040 2:N:0:GTTCCTCA
GGGGGGGGGGGGGGGGGGGGGNGGGGGGGGGGGGGGGGGGGGNGGGGGGNNAAGTNNNNNTGTTTTTNTNATTTANTATTTTNATTAATAATAAAAAA
+
AAAAAEEEEEEEEEEEAAAEE#EEEEA66666///6666/6/#//////##////#####///////#/#/////#//////#///////////////
@NB502117:158:HJ337BGX7:2:11101:8646:1040 2:N:0:GTTCCTCA
GTAAAAGCAGTCCTACTCTTCNACACTAGGAAGGCTTTACTTNTTTTAANTGGTGNAGNNGGAAAATNGNACATTNCATACTNAATTGGGTCCTTGTC
+
AAAAAEEEEEEEEEEEEEEEE#EEEEEEEEEAEEEEEEEAEE#EEEEEE#EEEEE#EE##EEEAEEE#/#EEEEE#/EEEEE#EEEEAEEEEEEEEE<

I1.fastq.gz

@NB502117:158:HJ337BGX7:1:11101:2571:1040 1:N:0:GTTCCTCA
GTTCCTCA
+
AAAAAEEE
@NB502117:158:HJ337BGX7:1:11101:14979:1040 1:N:0:GTTCCTCA
GTTCCTCA
+
AAAAAEEE
@NB502117:158:HJ337BGX7:1:11101:7745:1040 1:N:0:GTTCCTCA
GTTCCTCA
+
AAAAAEEE

Lioscro commented 3 years ago

Hi @khushboojindal, thanks for looking into that. I believe I1.fastq.gz is the Illumina index read, which should not be passed to kb. Could you try re-running kb without this file and let me know if you run into any other problems?
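One way to see this from the posted reads: every I1 sequence is 8 bases long and identical to the sample index at the end of its header line (GTTCCTCA). A minimal sketch for checking read lengths follows; `read_lengths` is a hypothetical helper and assumes standard four-line FASTQ records on standard input.

```shell
#!/bin/sh
# Hypothetical helper: print the length of each sequence line, i.e. the
# second line of every 4-line FASTQ record read from standard input.
read_lengths() {
  awk 'NR % 4 == 2 { print length($0) }'
}

# The first I1 record posted in this thread; prints 8.
printf '%s\n' \
  '@NB502117:158:HJ337BGX7:1:11101:2571:1040 1:N:0:GTTCCTCA' \
  'GTTCCTCA' \
  '+' \
  'AAAAAEEE' | read_lengths
```

On a real file, the same helper can be fed from `gunzip -c I1.fastq.gz | head -n 12`.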

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days