pachterlab / kb_python

A wrapper for the kallisto | bustools workflow for single-cell RNA-seq pre-processing
https://www.kallistobus.tools/
BSD 2-Clause "Simplified" License

Error when creating loom with gene names (instead of gene IDs) #270

Closed mt1022 closed 2 weeks ago

mt1022 commented 1 month ago

Describe the issue

I am running the latest version of kb_python to reproduce the example pipeline from the BIVI paper (which used an older version of kb_python). The loom file was generated successfully, but target_name contained gene IDs rather than gene names. I reran kb count with the --gene-names argument, but it failed at the loom generation step.

What is the exact command that was run?

kb ref --workflow=nac --verbose --overwrite \
    -i $main_path/indices/human_lamanno.idx -g $main_path/indices/human_lamanno.t2g \
    -c1 $main_path/indices/human_lamanno.mature.t2c -c2 $main_path/indices/human_lamanno.nascent.t2c \
    -f1 $main_path/indices/human.lamanno.mature.fa -f2 $main_path/indices/human.lamanno.nascent.fa \
    gencode46/GRCh38.primary_assembly.genome.fa gencode46/gencode.v46.annotation.gtf

kb count --verbose \
    -i $main_path/indices/human_lamanno.idx \
    -g $main_path/indices/human_lamanno.t2g \
    -x 10xv3 \
    --gene-names \
    -o $main_path/pbmc_1k_v3/ \
    -t 16 -m 60G \
    -c1 $main_path/indices/human_lamanno.mature.t2c \
    -c2 $main_path/indices/human_lamanno.nascent.t2c \
    --workflow nac --filter bustools --overwrite --loom \
    $main_path/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R1_001.fastq.gz \
    $main_path/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R2_001.fastq.gz \
    $main_path/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R1_001.fastq.gz \
    $main_path/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R2_001.fastq.gz

Command output (with --verbose flag)

[2024-09-14 18:02:26,258]   DEBUG [main] Printing verbose output
[2024-09-14 18:02:28,465]   DEBUG [main] kallisto binary located at /home/admin/miniforge3/envs/sc/bin/kallisto
[2024-09-14 18:02:28,465]   DEBUG [main] bustools binary located at /home/admin/miniforge3/envs/sc/bin/bustools
[2024-09-14 18:02:28,466]   DEBUG [main] Creating `/nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp` directory
[2024-09-14 18:02:28,467]   DEBUG [main] Namespace(list=False, command='count', tmp=None, keep_tmp=False, verbose=True, i='/nfs_data/zhangh/poj/example/indices/human_lamanno.idx', g='/nfs_data/zhangh/poj/example/indices/human_lamanno.t2g', x='10xv3', o='/nfs_data/zhangh/poj/example/pbmc_1k_v3/', num=False, w=None, r=None, t=16, m='60G', strand=None, inleaved=False, genomebam=False, aa=False, gtf=None, chromosomes=None, workflow='nac', em=False, mm=False, tcc=False, filter='bustools', filter_threshold=None, c1='/nfs_data/zhangh/poj/example/indices/human_lamanno.mature.t2c', c2='/nfs_data/zhangh/poj/example/indices/human_lamanno.nascent.t2c', overwrite=True, dry_run=False, batch_barcodes=False, loom=True, h5ad=False, loom_names='barcode,target_name', sum='none', cellranger=False, gene_names=True, N=None, report=False, no_inspect=False, kallisto='kallisto', bustools='bustools', no_validate=False, no_fragment=False, parity=None, fragment_l=None, fragment_s=None, bootstraps=None, matrix_to_files=False, matrix_to_directories=False, fastqs=['/nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R1_001.fastq.gz', '/nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R2_001.fastq.gz', '/nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R1_001.fastq.gz', '/nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R2_001.fastq.gz'])
[2024-09-14 18:02:30,946]    INFO [count_nac] Using index /nfs_data/zhangh/poj/example/indices/human_lamanno.idx to generate BUS file to /nfs_data/zhangh/poj/example/pbmc_1k_v3/ from
[2024-09-14 18:02:30,946]    INFO [count_nac]         /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R1_001.fastq.gz
[2024-09-14 18:02:30,946]    INFO [count_nac]         /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R2_001.fastq.gz
[2024-09-14 18:02:30,946]    INFO [count_nac]         /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R1_001.fastq.gz
[2024-09-14 18:02:30,946]    INFO [count_nac]         /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R2_001.fastq.gz
[2024-09-14 18:02:30,946]   DEBUG [count_nac] kallisto bus -i /nfs_data/zhangh/poj/example/indices/human_lamanno.idx -o /nfs_data/zhangh/poj/example/pbmc_1k_v3/ -x 10xv3 -t 16 /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R1_001.fastq.gz /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R2_001.fastq.gz /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R1_001.fastq.gz /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R2_001.fastq.gz
[2024-09-14 18:02:31,047]   DEBUG [count_nac] 
[2024-09-14 18:02:31,047]   DEBUG [count_nac] [bus] Note: Strand option was not specified; setting it to --fr-stranded for specified technology
[2024-09-14 18:03:25,644]   DEBUG [count_nac] [index] k-mer length: 31
[2024-09-14 18:04:07,683]   DEBUG [count_nac] [index] number of targets: 317,156
[2024-09-14 18:04:07,884]   DEBUG [count_nac] [index] number of k-mers: 1,588,671,125
[2024-09-14 18:04:07,884]   DEBUG [count_nac] [index] number of D-list k-mers: 9,821,558
[2024-09-14 18:04:08,286]   DEBUG [count_nac] [quant] will process sample 1: /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R1_001.fastq.gz
[2024-09-14 18:04:08,286]   DEBUG [count_nac] /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L001_R2_001.fastq.gz
[2024-09-14 18:04:08,286]   DEBUG [count_nac] [quant] will process sample 2: /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R1_001.fastq.gz
[2024-09-14 18:04:08,286]   DEBUG [count_nac] /nfs_data/zhangh/poj/example/pbmc_1k_v3_raw/pbmc_1k_v3_fastqs/pbmc_1k_v3_S1_L002_R2_001.fastq.gz
[2024-09-14 18:04:14,808]   DEBUG [count_nac] [quant] finding pseudoalignments for the reads ...
[2024-09-14 18:04:20,028]   DEBUG [count_nac] [progress] 1M reads processed (79.6% mapped)
[2024-09-14 18:04:24,446]   DEBUG [count_nac] [progress] 2M reads processed (79.8% mapped)
[2024-09-14 18:04:29,265]   DEBUG [count_nac] [progress] 3M reads processed (80.1% mapped)
[2024-09-14 18:04:33,884]   DEBUG [count_nac] [progress] 4M reads processed (80.3% mapped)
[2024-09-14 18:04:37,900]   DEBUG [count_nac] [progress] 5M reads processed (80.4% mapped)
[2024-09-14 18:04:41,012]   DEBUG [count_nac] [progress] 6M reads processed (80.5% mapped)
[2024-09-14 18:04:44,927]   DEBUG [count_nac] [progress] 7M reads processed (80.5% mapped)
[2024-09-14 18:04:48,742]   DEBUG [count_nac] [progress] 8M reads processed (80.4% mapped)
[2024-09-14 18:04:51,955]   DEBUG [count_nac] [progress] 9M reads processed (80.4% mapped)
[2024-09-14 18:04:55,770]   DEBUG [count_nac] [progress] 10M reads processed (80.4% mapped)
[2024-09-14 18:04:59,686]   DEBUG [count_nac] [progress] 11M reads processed (80.4% mapped)
[2024-09-14 18:05:03,501]   DEBUG [count_nac] [progress] 12M reads processed (80.5% mapped)
[2024-09-14 18:05:07,316]   DEBUG [count_nac] [progress] 13M reads processed (80.5% mapped)
[2024-09-14 18:05:11,332]   DEBUG [count_nac] [progress] 14M reads processed (80.5% mapped)
[2024-09-14 18:05:15,246]   DEBUG [count_nac] [progress] 15M reads processed (80.5% mapped)
[2024-09-14 18:05:19,162]   DEBUG [count_nac] [progress] 16M reads processed (80.5% mapped)
[2024-09-14 18:05:23,077]   DEBUG [count_nac] [progress] 17M reads processed (80.5% mapped)
[2024-09-14 18:05:26,993]   DEBUG [count_nac] [progress] 18M reads processed (80.4% mapped)
[2024-09-14 18:05:30,908]   DEBUG [count_nac] [progress] 19M reads processed (80.5% mapped)
[2024-09-14 18:05:34,522]   DEBUG [count_nac] [progress] 20M reads processed (80.5% mapped)
[2024-09-14 18:05:38,337]   DEBUG [count_nac] [progress] 21M reads processed (80.5% mapped)
[2024-09-14 18:05:41,751]   DEBUG [count_nac] [progress] 22M reads processed (80.5% mapped)
[2024-09-14 18:05:44,461]   DEBUG [count_nac] [progress] 23M reads processed (80.5% mapped)
[2024-09-14 18:05:48,477]   DEBUG [count_nac] [progress] 24M reads processed (80.5% mapped)
[2024-09-14 18:05:52,091]   DEBUG [count_nac] [progress] 25M reads processed (80.5% mapped)
[2024-09-14 18:05:56,007]   DEBUG [count_nac] [progress] 27M reads processed (80.5% mapped)
[2024-09-14 18:05:59,420]   DEBUG [count_nac] [progress] 28M reads processed (80.5% mapped)
[2024-09-14 18:06:03,235]   DEBUG [count_nac] [progress] 29M reads processed (80.5% mapped)
[2024-09-14 18:06:07,050]   DEBUG [count_nac] [progress] 30M reads processed (80.5% mapped)
[2024-09-14 18:06:10,865]   DEBUG [count_nac] [progress] 31M reads processed (80.5% mapped)
[2024-09-14 18:06:14,780]   DEBUG [count_nac] [progress] 32M reads processed (80.5% mapped)
[2024-09-14 18:06:18,495]   DEBUG [count_nac] [progress] 33M reads processed (80.5% mapped)
[2024-09-14 18:06:22,310]   DEBUG [count_nac] [progress] 34M reads processed (80.5% mapped)
[2024-09-14 18:06:26,125]   DEBUG [count_nac] [progress] 35M reads processed (80.5% mapped)
[2024-09-14 18:06:30,140]   DEBUG [count_nac] [progress] 36M reads processed (80.5% mapped)
[2024-09-14 18:06:33,956]   DEBUG [count_nac] [progress] 37M reads processed (80.6% mapped)
[2024-09-14 18:06:37,470]   DEBUG [count_nac] [progress] 38M reads processed (80.6% mapped)
[2024-09-14 18:06:41,385]   DEBUG [count_nac] [progress] 39M reads processed (80.6% mapped)
[2024-09-14 18:06:44,999]   DEBUG [count_nac] [progress] 40M reads processed (80.6% mapped)
[2024-09-14 18:06:48,913]   DEBUG [count_nac] [progress] 41M reads processed (80.6% mapped)
[2024-09-14 18:06:52,828]   DEBUG [count_nac] [progress] 42M reads processed (80.6% mapped)
[2024-09-14 18:06:56,443]   DEBUG [count_nac] [progress] 43M reads processed (80.6% mapped)
[2024-09-14 18:06:59,956]   DEBUG [count_nac] [progress] 44M reads processed (80.7% mapped)
[2024-09-14 18:07:03,469]   DEBUG [count_nac] [progress] 45M reads processed (80.7% mapped)
[2024-09-14 18:07:07,183]   DEBUG [count_nac] [progress] 46M reads processed (80.7% mapped)
[2024-09-14 18:07:10,395]   DEBUG [count_nac] [progress] 47M reads processed (80.7% mapped)
[2024-09-14 18:07:14,310]   DEBUG [count_nac] [progress] 48M reads processed (80.7% mapped)
[2024-09-14 18:07:18,325]   DEBUG [count_nac] [progress] 49M reads processed (80.7% mapped)
[2024-09-14 18:07:22,139]   DEBUG [count_nac] [progress] 50M reads processed (80.7% mapped)
[2024-09-14 18:07:26,153]   DEBUG [count_nac] [progress] 51M reads processed (80.7% mapped)
[2024-09-14 18:07:30,069]   DEBUG [count_nac] [progress] 53M reads processed (80.7% mapped)
[2024-09-14 18:07:33,884]   DEBUG [count_nac] [progress] 54M reads processed (80.8% mapped)
[2024-09-14 18:07:37,696]   DEBUG [count_nac] [progress] 55M reads processed (80.8% mapped)
[2024-09-14 18:07:41,710]   DEBUG [count_nac] [progress] 56M reads processed (80.8% mapped)
[2024-09-14 18:07:45,422]   DEBUG [count_nac] [progress] 57M reads processed (80.8% mapped)
[2024-09-14 18:07:49,136]   DEBUG [count_nac] [progress] 58M reads processed (80.8% mapped)
[2024-09-14 18:07:52,950]   DEBUG [count_nac] [progress] 59M reads processed (80.8% mapped)
[2024-09-14 18:07:56,564]   DEBUG [count_nac] [progress] 60M reads processed (80.8% mapped)
[2024-09-14 18:08:00,276]   DEBUG [count_nac] [progress] 61M reads processed (80.8% mapped)
[2024-09-14 18:08:03,689]   DEBUG [count_nac] [progress] 62M reads processed (80.9% mapped)
[2024-09-14 18:08:07,603]   DEBUG [count_nac] [progress] 63M reads processed (80.9% mapped)
[2024-09-14 18:08:10,715]   DEBUG [count_nac] [progress] 64M reads processed (80.9% mapped)
[2024-09-14 18:08:14,529]   DEBUG [count_nac] [progress] 65M reads processed (80.9% mapped)
[2024-09-14 18:08:15,733]   DEBUG [count_nac] [progress] 66M reads processed (80.9% mapped)              done
[2024-09-14 18:08:15,733]   DEBUG [count_nac] [quant] processed 66,601,887 reads, 53,889,405 reads pseudoaligned
[2024-09-14 18:08:17,941]   DEBUG [count_nac] 
[2024-09-14 18:08:36,005]    INFO [count_nac] Sorting BUS file /nfs_data/zhangh/poj/example/pbmc_1k_v3/output.bus to /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.bus
[2024-09-14 18:08:36,005]   DEBUG [count_nac] bustools sort -o /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.bus -T /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp -t 16 -m 60G /nfs_data/zhangh/poj/example/pbmc_1k_v3/output.bus
[2024-09-14 18:09:06,521]   DEBUG [count_nac] partition time: 1.36s
[2024-09-14 18:09:07,023]   DEBUG [count_nac] all fits in buffer
[2024-09-14 18:09:12,042]   DEBUG [count_nac] Read in 53889405 BUS records
[2024-09-14 18:09:12,042]   DEBUG [count_nac] reading time 1.26s
[2024-09-14 18:09:12,042]   DEBUG [count_nac] sorting time 9.06s
[2024-09-14 18:09:12,042]   DEBUG [count_nac] writing time 0.81s
[2024-09-14 18:09:13,945]    INFO [count_nac] On-list not provided
[2024-09-14 18:09:13,945]    INFO [count_nac] Copying pre-packaged 10XV3 on-list to /nfs_data/zhangh/poj/example/pbmc_1k_v3/
[2024-09-14 18:09:14,656]    INFO [count_nac] Inspecting BUS file /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.bus
[2024-09-14 18:09:14,656]   DEBUG [count_nac] bustools inspect -o /nfs_data/zhangh/poj/example/pbmc_1k_v3/inspect.json -w /nfs_data/zhangh/poj/example/pbmc_1k_v3/10x_version3_whitelist.txt /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.bus
[2024-09-14 18:09:24,892]    INFO [count_nac] Correcting BUS records in /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.bus to /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.c.bus with on-list /nfs_data/zhangh/poj/example/pbmc_1k_v3/10x_version3_whitelist.txt
[2024-09-14 18:09:24,892]   DEBUG [count_nac] bustools correct -o /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.c.bus -w /nfs_data/zhangh/poj/example/pbmc_1k_v3/10x_version3_whitelist.txt /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.bus
[2024-09-14 18:09:29,909]   DEBUG [count_nac] Found 6794880 barcodes in the on-list
[2024-09-14 18:09:34,425]   DEBUG [count_nac] Processed 24497393 BUS records
[2024-09-14 18:09:34,425]   DEBUG [count_nac] In on-list = 22882697
[2024-09-14 18:09:34,426]   DEBUG [count_nac] Corrected    = 204718
[2024-09-14 18:09:34,426]   DEBUG [count_nac] Uncorrected  = 1409978
[2024-09-14 18:09:36,732]    INFO [count_nac] Sorting BUS file /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.c.bus to /nfs_data/zhangh/poj/example/pbmc_1k_v3/output.unfiltered.bus
[2024-09-14 18:09:36,732]   DEBUG [count_nac] bustools sort -o /nfs_data/zhangh/poj/example/pbmc_1k_v3/output.unfiltered.bus -T /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp -t 16 -m 60G /nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp/output.s.c.bus
[2024-09-14 18:10:05,340]   DEBUG [count_nac] partition time: 0.25s
[2024-09-14 18:10:05,541]   DEBUG [count_nac] all fits in buffer
[2024-09-14 18:10:10,159]   DEBUG [count_nac] Read in 23087415 BUS records
[2024-09-14 18:10:10,159]   DEBUG [count_nac] reading time 0.36s
[2024-09-14 18:10:10,159]   DEBUG [count_nac] sorting time 2.94s
[2024-09-14 18:10:10,159]   DEBUG [count_nac] writing time 0.6s
[2024-09-14 18:10:12,064]    INFO [count_nac] Generating count matrix /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/cells_x_genes from BUS file /nfs_data/zhangh/poj/example/pbmc_1k_v3/output.unfiltered.bus
[2024-09-14 18:10:12,064]   DEBUG [count_nac] bustools count -o /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/cells_x_genes -g /nfs_data/zhangh/poj/example/indices/human_lamanno.t2g -e /nfs_data/zhangh/poj/example/pbmc_1k_v3/matrix.ec -t /nfs_data/zhangh/poj/example/pbmc_1k_v3/transcripts.txt -s /nfs_data/zhangh/poj/example/indices/human_lamanno.nascent.t2c --genecounts --umi-gene /nfs_data/zhangh/poj/example/pbmc_1k_v3/output.unfiltered.bus
[2024-09-14 18:11:29,708]   DEBUG [count_nac] /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/cells_x_genes.mature.mtx passed validation
[2024-09-14 18:11:31,726]   DEBUG [count_nac] /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/cells_x_genes.nascent.mtx passed validation
[2024-09-14 18:11:34,216]   DEBUG [count_nac] /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/cells_x_genes.ambiguous.mtx passed validation
[2024-09-14 18:11:34,217]    INFO [count_nac] Writing gene names to file /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/cells_x_genes.genes.names.txt
[2024-09-14 18:11:34,600] WARNING [count_nac] 2103 gene IDs do not have corresponding valid gene names. These genes will use their gene IDs instead.
[2024-09-14 18:11:34,645]    INFO [count_nac] Reading matrix /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/cells_x_genes.mature.mtx
[2024-09-14 18:11:36,016]    INFO [count_nac] Reading matrix /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/cells_x_genes.nascent.mtx
[2024-09-14 18:11:38,649]    INFO [count_nac] Reading matrix /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/cells_x_genes.ambiguous.mtx
[2024-09-14 18:11:41,783]    INFO [count_nac] Combining matrices
[2024-09-14 18:11:42,120]    INFO [count_nac] Writing matrices to loom /nfs_data/zhangh/poj/example/pbmc_1k_v3/counts_unfiltered/adata.loom
[2024-09-14 18:34:20,842]   ERROR [main] An exception occurred
Traceback (most recent call last):
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/kb_python/main.py", line 1618, in main
    COMMAND_TO_FUNCTION[args.command](parser, args, temp_dir=temp_dir)
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/kb_python/main.py", line 592, in parse_count
    count_nac(
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/ngs_tools/logging.py", line 62, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/kb_python/count.py", line 2061, in count_nac
    convert_result = convert_matrices(
                     ^^^^^^^^^^^^^^^^^
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/kb_python/count.py", line 829, in convert_matrices
    adata.write_loom(loom_path)
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/anndata/_core/anndata.py", line 1973, in write_loom
    write_loom(filename, self, write_obsm_varm=write_obsm_varm)
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/anndata/_io/write.py", line 110, in write_loom
    create(fspath(filename), layers, row_attrs=row_attrs, col_attrs=col_attrs)
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/loompy/loompy.py", line 1073, in create
    raise ve
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/loompy/loompy.py", line 1065, in create
    ds.ra[key] = vals
    ~~~~~^^^^^
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/loompy/attribute_manager.py", line 129, in __setitem__
    return self.__setattr__(name, val)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/loompy/attribute_manager.py", line 149, in __setattr__
    values = loompy.normalize_attr_values(val, compare_loom_spec_version(self.ds._file, "3.0.0") >= 0)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/loompy/normalize.py", line 69, in normalize_attr_values
    arr = normalize_attr_array(a)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/admin/miniforge3/envs/sc/lib/python3.11/site-packages/loompy/normalize.py", line 47, in normalize_attr_array
    raise ValueError("Argument must be a list, tuple, numpy matrix, numpy ndarray or sparse matrix.")
ValueError: Argument must be a list, tuple, numpy matrix, numpy ndarray or sparse matrix.
[2024-09-14 18:34:20,852]   DEBUG [main] Removing `/nfs_data/zhangh/poj/example/pbmc_1k_v3/tmp` directory

versions

kb info
# kb_python 0.28.2
# kallisto: 0.51.0 (kallisto)
# bustools: 0.44.0 (bustools)

Yenaled commented 1 month ago

--gene-names is no longer really a maintained option in kb-python, because gene names are already available in the counts_unfiltered directory and can be readily mapped to the corresponding gene IDs.
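A minimal sketch of that mapping, assuming the gene names file has one name per line in the same order as a gene ID file written alongside it. The ID file name used in the usage note (cells_x_genes.genes.txt) and both helper functions are assumptions for illustration, not part of kb-python's API:

```python
from pathlib import Path


def load_gene_names(ids_path, names_path):
    """Pair each gene ID with the name on the same line of the names file."""
    ids = Path(ids_path).read_text().splitlines()
    names = Path(names_path).read_text().splitlines()
    if len(ids) != len(names):
        raise ValueError(f"{len(ids)} gene IDs but {len(names)} gene names")
    return dict(zip(ids, names))


def unresolved(id_to_name):
    """Return the IDs whose 'name' is just the ID itself (no name was found)."""
    return [gene_id for gene_id, name in id_to_name.items() if gene_id == name]
```

Usage would look roughly like `load_gene_names("counts_unfiltered/cells_x_genes.genes.txt", "counts_unfiltered/cells_x_genes.genes.names.txt")`. The resulting dictionary can then be used to rename the gene axis after loading a matrix produced without --gene-names, and `unresolved` lists the genes that kept their ID.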

mt1022 commented 1 month ago

Thanks. I found the gene names file in the counts_unfiltered directory:

pbmc_1k_v3/counts_unfiltered/cells_x_genes.genes.names.txt

However, the gene names in this file are irregular. Here are some lines from this file.

...
WBP1LP7
OR4F29
ENSG00000237094
CICP7
ENSG00000250575
ENSG00000278757.1
ENSG00000293331
ENSG00000235146
MTND1P23
MTND2P28
MTCO1P12
ENSG00000278791
MTCO2P12
MTATP8P1
...

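As shown above, some entries are still gene IDs even though the GTF provides a gene name for them. For example, ENSG00000278757.1 has the gene name U6 in the annotation: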

grep 'ENSG00000278757.1' ~/psite_test/gencode46/gencode.v46.annotation.gtf
# chr1  ENSEMBL gene    516376  516479  .   -   .   gene_id "ENSG00000278757.1"; gene_type "snRNA"; gene_name "U6"; level 3;
# chr1  ENSEMBL transcript  516376  516479  .   -   .   gene_id "ENSG00000278757.1"; transcript_id "ENST00000614007.1"; gene_type "snRNA"; gene_name "U6"; transcript_type "snRNA"; transcript_name "U6.90-201"; level 3; transcript_support_level "NA"; tag "basic"; tag "Ensembl_canonical";
# chr1  ENSEMBL exon    516376  516479  .   -   .   gene_id "ENSG00000278757.1"; transcript_id "ENST00000614007.1"; gene_type "snRNA"; gene_name "U6"; transcript_type "snRNA"; transcript_name "U6.90-201"; exon_number 1; exon_id "ENSE00003746310.1"; level 3; transcript_support_level "NA"; tag "basic"; tag "Ensembl_canonical";
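As a possible workaround, the gene_id to gene_name mapping can be read directly from the GTF instead of relying on the names file. The sketch below is illustrative only: the function names are hypothetical, it assumes standard tab-separated GTF lines with quoted attribute values, and it reads names from "gene" records only.

```python
import re

# Matches attribute pairs of the form: key "value"
_ATTR = re.compile(r'(\S+) "([^"]*)"')


def gtf_attributes(attr_field):
    """Parse the quoted key/value pairs of a GTF attribute column.

    Unquoted values (such as `level 3`) are skipped, and for repeated
    keys (such as `tag`) only the first value is kept.
    """
    attrs = {}
    for key, value in _ATTR.findall(attr_field):
        attrs.setdefault(key, value)
    return attrs


def gene_names_from_gtf(lines):
    """Build {gene_id: gene_name} from the 'gene' records of a GTF."""
    mapping = {}
    for line in lines:
        if line.startswith("#"):
            continue  # header or comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        attrs = gtf_attributes(fields[8])
        if "gene_id" in attrs and "gene_name" in attrs:
            mapping[attrs["gene_id"]] = attrs["gene_name"]
    return mapping


# The gene record from the grep output above, with tabs restored.
example = "\t".join([
    "chr1", "ENSEMBL", "gene", "516376", "516479", ".", "-", ".",
    'gene_id "ENSG00000278757.1"; gene_type "snRNA"; gene_name "U6"; level 3;',
])
print(gene_names_from_gtf([example]))  # {'ENSG00000278757.1': 'U6'}
```

For the full annotation, the same function can be given an open file handle, for example `with open("gencode46/gencode.v46.annotation.gtf") as fh: mapping = gene_names_from_gtf(fh)`.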

github-actions[bot] commented 3 weeks ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days