pachterlab / kb_python

A wrapper for the kallisto | bustools workflow for single-cell RNA-seq pre-processing
https://www.kallistobus.tools/
BSD 2-Clause "Simplified" License

Instructions for use with single-end SMART-seq data #98

Closed winni2k closed 3 years ago

winni2k commented 3 years ago

Describe the issue

I would like to run kb count on a list of FASTQs from a SMART-seq (v1, bulk RNA-seq) experiment. The reads are single-end. The kb count instructions specify that counting only works with paired FASTQs. Is this strictly true, or is there a way to also run the command using single-end data? Also, why does the SMARTSEQ technology not work with the lamanno workflow?

What is the exact command that was run?

kb count --h5ad \
    -i index.idx \
    -g t2g.txt \
    -x SMARTSEQ \
    -o $(dirname results/lamanno/human/kb_count.h5) \
    -c1 results/lamanno/human/cdna_t2c.txt \
    -c2 results/lamanno/human/intron_t2c.txt \
    --workflow lamanno \
    --filter bustools \
    results/data/SRR2144273/SRR2144273.fa.gz

Command output (with --verbose flag)

[2021-02-24 13:26:49,643]   DEBUG Printing verbose output
[2021-02-24 13:26:49,643]   DEBUG kallisto binary located at /home/warkre/miniconda3/envs/kb-python/lib/python3.8/site-packages/kb_python/bins/linux/kallisto/kallisto
[2021-02-24 13:26:49,643]   DEBUG bustools binary located at /home/warkre/miniconda3/envs/kb-python/lib/python3.8/site-packages/kb_python/bins/linux/bustools/bustools
[2021-02-24 13:26:49,643]   DEBUG Creating results/lamanno/human/tmp directory
[2021-02-24 13:26:49,644]   DEBUG Namespace(c1='results/lamanno/human/cdna_t2c.txt', c2='results/lamanno/human/intron_t2c.txt', cellranger=False, command='count', dry_run=False, fastqs=['results/data/SRR2144273/SRR2144273.fa.gz'], filter='bustools', g='t2g.txt', h5ad=True, i='index.idx', keep_tmp=False, lamanno=False, list=False, loom=False, m='4G', mm=False, no_inspect=False, no_validate=False, nucleus=False, o='results/lamanno/human', overwrite=False, report=False, t=8, tcc=False, tmp=None, verbose=True, w=None, workflow='lamanno', x='SMARTSEQ')
usage: kb [-h] [--list] <CMD> ...
kb: error: Technology `SMARTSEQ` can not be used with workflow lamanno.
[2021-02-24 13:26:49,644]   DEBUG Removing results/lamanno/human/tmp directory

Lioscro commented 3 years ago

Spliced and unspliced counts are currently supported only for droplet-based technologies (@sbooeshaghi please correct me if this has changed since the last time I looked into this). Also, single-end reads aren't supported in the kb wrapper yet (but this is possible by running kallisto pseudo manually).
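
For the single-end reads in the original question, a manual kallisto pseudo run might look roughly like the sketch below. It is not a tested recipe. It assumes a kallisto release that still ships the pseudo subcommand, and the output directory (pseudo_out), the batch file name (batch.txt), the helper make_batch, and the fragment length and standard deviation (-l 200 -s 20) are all placeholders that would need to match the actual library. The final command is only printed with echo; remove the echo to run it.

```shell
# Hypothetical helper: print one batch line per FASTQ as "<sample id><TAB><file>".
# With a single file per line, kallisto treats the sample as single-end in batch mode.
make_batch() {
    for fq in "$@"; do
        id=$(basename "$fq")
        id=${id%%.*}    # sample id = file name up to the first dot
        printf '%s\t%s\n' "$id" "$fq"
    done
}

# Build the batch file from the single-end input used in the issue.
make_batch results/data/SRR2144273/SRR2144273.fa.gz > batch.txt

# Dry run: print the command instead of executing it.
# --single needs an assumed fragment length (-l) and standard deviation (-s);
# 200 and 20 are placeholder values, not measurements from this data set.
echo kallisto pseudo -i index.idx -o pseudo_out --single -l 200 -s 20 -b batch.txt
```

Batch mode keeps one entry per sample, so additional single-end FASTQs can be passed to make_batch without changing the kallisto command.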

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.