pachterlab / kb_python

A wrapper for the kallisto | bustools workflow for single-cell RNA-seq pre-processing
https://www.kallistobus.tools/
BSD 2-Clause "Simplified" License

Issue with Combining matrices #80

Closed JLimen closed 4 years ago

JLimen commented 4 years ago

Hi,

I am trying to generate loom files to use downstream with velocyto. The count command ran well and generated the spliced and unspliced matrices, but it seems to have failed at the point of combining them. I checked, and I can import both the anndata and loompy libraries in Python, so I am not sure what the issue is.

Thank you very much for any insight into this.

The exact command I ran was:

kb count -i Mus_musculus_index.idx -g t2g.txt -x 10xv2 -o Scell1 --workflow lamanno --loom -c1 cdna_t2c.txt -c2 intron_t2c.txt Scell1_GEX_S1_L001_R1_001.fastq.gz Scell1_GEX_S1_L001_R2_001.fastq.gz Scell1_GEX_S1_L002_R1_001.fastq.gz Scell1_GEX_S1_L002_R2_001.fastq.gz Scell1_GEX_S2_L001_R1_001.fastq.gz Scell1_GEX_S2_L001_R2_001.fastq.gz Scell1_GEX_S2_L002_R1_001.fastq.gz Scell1_GEX_S2_L002_R2_001.fastq.gz Scell1_GEX_S3_L001_R1_001.fastq.gz Scell1_GEX_S3_L001_R2_001.fastq.gz Scell1_GEX_S3_L002_R1_001.fastq.gz Scell1_GEX_S3_L002_R2_001.fastq.gz Scell1_GEX_S4_L001_R1_001.fastq.gz Scell1_GEX_S4_L001_R2_001.fastq.gz Scell1_GEX_S4_L002_R1_001.fastq.gz Scell1_GEX_S4_L002_R2_001.fastq.gz

Command output :

[2020-09-08 21:20:37,450]    INFO Using index Mus_musculus_index.idx to generate BUS file to Scell1 from
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S1_L001_R1_001.fastq.gz
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S1_L001_R2_001.fastq.gz
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S1_L002_R1_001.fastq.gz
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S1_L002_R2_001.fastq.gz
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S2_L001_R1_001.fastq.gz
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S2_L001_R2_001.fastq.gz
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S2_L002_R1_001.fastq.gz
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S2_L002_R2_001.fastq.gz
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S3_L001_R1_001.fastq.gz
[2020-09-08 21:20:37,450]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S3_L001_R2_001.fastq.gz
[2020-09-08 21:20:37,453]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S3_L002_R1_001.fastq.gz
[2020-09-08 21:20:37,453]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S3_L002_R2_001.fastq.gz
[2020-09-08 21:20:37,454]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S4_L001_R1_001.fastq.gz
[2020-09-08 21:20:37,454]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S4_L001_R2_001.fastq.gz
[2020-09-08 21:20:37,454]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S4_L002_R1_001.fastq.gz
[2020-09-08 21:20:37,454]    INFO         /Volumes/wd_elements/Sequencing_data/Singlecell_GEX_2019/Scell1_GEX_S4_L002_R2_001.fastq.gz
[2020-09-08 21:37:05,017]    INFO Sorting BUS file Scell1/output.bus to Scell1/tmp/output.s.bus
[2020-09-08 21:38:26,400]    INFO Whitelist not provided
[2020-09-08 21:38:26,401]    INFO Copying pre-packaged 10XV2 whitelist to Scell1
[2020-09-08 21:38:26,496]    INFO Inspecting BUS file Scell1/tmp/output.s.bus
[2020-09-08 21:39:25,680]    INFO Correcting BUS records in Scell1/tmp/output.s.bus to Scell1/tmp/output.s.c.bus with whitelist Scell1/10xv2_whitelist.txt
[2020-09-08 21:39:46,286]    INFO Sorting BUS file Scell1/tmp/output.s.c.bus to Scell1/output.unfiltered.bus
[2020-09-08 21:40:12,302]    INFO Capturing records from BUS file Scell1/output.unfiltered.bus to Scell1/tmp/spliced.bus with capture list intron_t2c.txt
[2020-09-08 21:41:22,452]    INFO Sorting BUS file Scell1/tmp/spliced.bus to Scell1/spliced.unfiltered.bus
[2020-09-08 21:41:35,562]    INFO Inspecting BUS file Scell1/spliced.unfiltered.bus
[2020-09-08 21:42:24,438]    INFO Generating count matrix Scell1/counts_unfiltered/spliced from BUS file Scell1/spliced.unfiltered.bus
[2020-09-08 21:43:49,732]    INFO Capturing records from BUS file Scell1/output.unfiltered.bus to Scell1/tmp/unspliced.bus with capture list cdna_t2c.txt
[2020-09-08 21:44:52,796]    INFO Sorting BUS file Scell1/tmp/unspliced.bus to Scell1/unspliced.unfiltered.bus
[2020-09-08 21:44:59,056]    INFO Inspecting BUS file Scell1/unspliced.unfiltered.bus
[2020-09-08 21:45:45,018]    INFO Generating count matrix Scell1/counts_unfiltered/unspliced from BUS file Scell1/unspliced.unfiltered.bus
[2020-09-08 21:47:01,560]    INFO Reading matrix Scell1/counts_unfiltered/spliced.mtx
[2020-09-08 21:47:16,627]    INFO Reading matrix Scell1/counts_unfiltered/unspliced.mtx
[2020-09-08 21:47:27,289]    INFO Combining matrices
/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/anndata/_core/anndata.py:1094: FutureWarning: is_categorical is deprecated and will be removed in a future version.  Use is_categorical_dtype instead
  if not is_categorical(df_full[k]):
[2020-09-08 21:47:27,687]    INFO Writing matrices to loom Scell1/counts_unfiltered/adata.loom
An exception occurred
Traceback (most recent call last):
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/kb_python/main.py", line 727, in main
    COMMAND_TO_FUNCTION[args.command](parser, args, temp_dir=temp_dir)
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/kb_python/main.py", line 179, in parse_count
    count_velocity(
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/kb_python/count.py", line 1341, in count_velocity
    convert_matrices(
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/kb_python/count.py", line 608, in convert_matrices
    adata.write_loom(loom_path)
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/anndata/_core/anndata.py", line 1891, in write_loom
    write_loom(filename, self, write_obsm_varm=write_obsm_varm)
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/anndata/_io/write.py", line 89, in write_loom
    raise ValueError("loompy does not accept empty matrices as data")
ValueError: loompy does not accept empty matrices as data
[2020-09-08 21:47:27,696]   ERROR An exception occurred
Traceback (most recent call last):
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/kb_python/main.py", line 727, in main
    COMMAND_TO_FUNCTION[args.command](parser, args, temp_dir=temp_dir)
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/kb_python/main.py", line 179, in parse_count
    count_velocity(
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/kb_python/count.py", line 1341, in count_velocity
    convert_matrices(
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/kb_python/count.py", line 608, in convert_matrices
    adata.write_loom(loom_path)
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/anndata/_core/anndata.py", line 1891, in write_loom
    write_loom(filename, self, write_obsm_varm=write_obsm_varm)
  File "/Users/jln_home/.pyenv/versions/miniconda3-latest/envs/velocytoenv/lib/python3.8/site-packages/anndata/_io/write.py", line 89, in write_loom
    raise ValueError("loompy does not accept empty matrices as data")
ValueError: loompy does not accept empty matrices as data

Lioscro commented 4 years ago

When constructing the anndata object, kb "overlays" the spliced and unspliced matrices by taking the intersection of the cells and genes. My guess is that the resulting anndata object is empty because there are no overlapping cells and/or genes between the two matrices.

It would help us debug if you could provide the .mtx, .genes.txt, and .barcodes.txt files for both matrices.
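
In the meantime, a quick way to test the empty-intersection guess is to count how many barcodes and genes the two matrices share. The following is a minimal sketch, not part of kb: `read_ids` and `overlap_report` are hypothetical helper names, and the sketch assumes each matrix prefix has sibling `<prefix>.barcodes.txt` and `<prefix>.genes.txt` files with one identifier per line.

```python
def read_ids(path):
    """Return the non-empty lines of a text file as a set of stripped strings."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def overlap_report(spliced_prefix, unspliced_prefix):
    """Count barcodes and genes shared by two count-matrix prefixes.

    Assumption: each prefix has `<prefix>.barcodes.txt` and
    `<prefix>.genes.txt` next to its .mtx file, one identifier per line.
    """
    report = {}
    for kind in ("barcodes", "genes"):
        spliced = read_ids(f"{spliced_prefix}.{kind}.txt")
        unspliced = read_ids(f"{unspliced_prefix}.{kind}.txt")
        report[kind] = {
            "spliced": len(spliced),
            "unspliced": len(unspliced),
            # Identifiers present in both files; 0 means an empty intersection.
            "shared": len(spliced & unspliced),
        }
    return report
```

For the run above, the call would presumably be `overlap_report("Scell1/counts_unfiltered/spliced", "Scell1/counts_unfiltered/unspliced")`. If either `shared` count is 0, the combined object would have no cells or no genes, which would be consistent with the `ValueError` raised by `write_loom`.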

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days