Open fernando-pierucci-alves opened 4 weeks ago
?
Hi Yenaled,
I made a mistake in sending before completing. Again:
I am executing this command as per 'box 2' in Nat. Biot. 2024 article: ~$ kb ref -i fpa_index.idx -g fpa_t2g.txt -f1 fpa_cdna.fasta cds_from_genomic.fasta genomic.gtf
The fasta and gtf are the latest human NCBI files.
At the end of the indexing process, I am getting this output in terminal:
4. [2024-10-30 13:18:19,235] WARNING [main] Gene
KIR3DP1_34has no transcripts. The entire gene will be marked as a transcript and an exon with ID
KIR3DP1_34. [2024-10-30 13:18:19,235] WARNING [main] Gene
KIR2DP1_36has no transcripts. The entire gene will be marked as a transcript and an exon with ID
KIR2DP1_36`.
[2024-10-30 13:18:23,996] INFO [ref] Splitting genome cds_from_genomic.fasta into cDNA at /home/fpa6/tmp/tmpxjhvjtxe
[2024-10-30 14:53:01,887] INFO [ref] Concatenating 1 cDNAs to fpa_cdna.fasta
[2024-10-30 14:53:01,887] INFO [ref] Creating transcript-to-gene mapping at fpa_t2g.txt
[2024-10-30 14:53:01,888] INFO [ref] Indexing fpa_cdna.fasta to fpa_index.idx
[2024-10-30 14:53:02,989] ERROR [ref]
Warning: you asked for 8, but only 4 cores on the machine
[build] loading fasta file fpa_cdna.fasta
[build] k-mer length: 31
KmerStream::KmerStream(): Input file tmp/kallisto.XEaFUK6CL0C03cU43RFyOti does not exist, is ill-formed or is not in FASTA/FASTQ/GFA format.
[2024-10-30 14:53:02,989] ERROR [main] An exception occurred
Traceback (most recent call last):
File "/home/fpa6/venv_fpa_kbpython/lib/python3.12/site-packages/kb_python/main.py", line 1926, in main
COMMAND_TO_FUNCTION[args.command](parser, args, temp_dir=temp_dir)
File "/home/fpa6/venv_fpa_kbpython/lib/python3.12/site-packages/kb_python/main.py", line 359, in parse_ref
ref(
File "/home/fpa6/venv_fpa_kbpython/lib/python3.12/site-packages/ngs_tools/logging.py", line 62, in inner
return func(*args, *kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/fpa6/venv_fpa_kbpython/lib/python3.12/site-packages/kb_python/ref.py", line 689, in ref
) if n > 1 else kallisto_index(
^^^^^^^^^^^^^^^
File "/home/fpa6/venv_fpa_kbpython/lib/python3.12/site-packages/kb_python/ref.py", line 291, in kallisto_index
run_executable(command)
File "/home/fpa6/venv_fpa_kbpython/lib/python3.12/site-packages/kb_python/dry/init.py", line 25, in inner
return func(args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/fpa6/venv_fpa_kbpython/lib/python3.12/site-packages/kb_python/utils.py", line 203, in run_executable
raise sp.CalledProcessError(p.returncode, ' '.join(command))
subprocess.CalledProcessError: Command '/home/fpa6/venv_fpa_kbpython/lib/python3.12/site-packages/kb_python/bins/linux/kallisto/kallisto index -i fpa_index.idx -k 31 -t 8 -d cds_from_genomic.fasta fpa_cdna.fasta' returned non-zero exit status 1.
The three output files are produced but they are empty.
Can you please provide insight? Thank you in advance and congrats on this overall scientific work.
Fernando
cds_from_genomic.fasta is not the whole genome; it's just the CDS. You need to supply a whole genome FASTA.
The article states...
"The default is --workflow=standard, which creates an index suitable for bulk and single-cell RNA-seq quantification. It creates an index built from only the coding DNA sequences (the usage of coding DNA here follows that of Ensembl30, i.e., the sequences of the mature transcripts wherein introns are not included as they have been spliced out)."
Am I misinterpreting the above? Thanks.
It builds an index from coding sequences; that doesn’t mean you supply the coding sequences. kb ref extracts the coding sequences for you from the genomic FASTA. That’s why the paper says “For both the standard and nac index types, a user supplies a genome FASTA and GTF annotation, which kb-python uses to extract the relevant sequences”.
That should solve the problem. Thanks again,
FPA
Describe the issue I am executing this command as per 'box 2' in Nat. Biot. 2024 article: ~$ kb ref -i fpa_index.idx -g fpa_t2g.txt -f1 fpa_cdna.fasta cds_from_genomic.fasta genomic.gtf
What is the exact command that was run?
Command output (with
--verbose
flag)