cytham / telomap

A tool to analyze telomeric reads from WGS or telobait-capture long-read sequencing data
GNU General Public License v3.0

command line arguments not parsing correctly #2

Open santiago-es opened 10 months ago

santiago-es commented 10 months ago

Hi there,

I've sequenced some samples with PacBio following telomere enrichment, as described in your paper, and am trying to analyze the data.

I attempted: telomap [reads.fa] [capture_oligo.fa] [barcodes.fa] [data_type] [no_cores] [working_directory]

as listed in the README, but this fails and produces a usage error message. The error appears to be triggered by the number of arguments on the command line, but the indexing seems to be off by 1. I edited the indexing in my own install, and it then runs into another error. Could you clarify the appropriate usage of telomap?

cytham commented 10 months ago

Hi @santiago-es, your usage seems correct. Can you please show me the error messages, both the one from the supposed off-by-one index and the other error?

santiago-es commented 10 months ago

telomap m84085_231031_213744_s2.hifi_reads.bc1008.bam capture.fa barcodes.fa bam 16 work/
Input Error - usage: telomap reads.bam capture_oligo.fa barcodes.fa data_type no_cores working_directory

This is the command I used and the error it produced, with the files I created for the capture oligos and barcodes, running with 16 threads and using ./work/ as the working directory. If you edit telomap so that the above error message appears when len(argv) != 7 instead of != 5, the error becomes:

[13/11/2023 11:51:59] - Telomap started
Traceback (most recent call last):
  File "/.miniconda/bin/telomap", line 118, in <module>
    main()
  File "/.miniconda/bin/telomap", line 33, in main
    out = TeloMap(read_path, oligo_path, barcode_path, data_type, cores, tsv_header=True)
  File "/.miniconda/lib/python3.9/site-packages/telomap/integrate.py", line 22, in __init__
    self.barcodes = self.parse_fasta(barcode_path)
  File "/.miniconda/lib/python3.9/site-packages/telomap/integrate.py", line 52, in parse_fasta
    seq = next(f).strip()
StopIteration

cytham commented 10 months ago

Thanks for correcting the argv input error. I will make that change in the next version release. For the other error, can you please check whether your barcodes.fa file has the correct FASTA format? Would it be possible to send it to me?
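
For readers following along, the argument-count fix discussed above can be sketched roughly as follows. This is a minimal illustration, not telomap's actual code: the names check_args, EXPECTED_ARGS, and USAGE are hypothetical, and the only facts taken from the thread are the six positional arguments and the usage message. Because argv[0] holds the script name, six arguments correspond to len(argv) == 7.

```python
# Hypothetical sketch of an argument-count check; not taken from telomap itself.
# Six positional arguments: reads, capture_oligo, barcodes, data_type,
# no_cores, working_directory.
EXPECTED_ARGS = 6

USAGE = (
    "Input Error - usage: telomap reads.bam capture_oligo.fa barcodes.fa "
    "data_type no_cores working_directory"
)


def check_args(argv):
    """Return the positional arguments, or exit with the usage message."""
    # argv[0] is the script name, so the expected length is EXPECTED_ARGS + 1.
    if len(argv) != EXPECTED_ARGS + 1:
        raise SystemExit(USAGE)
    return argv[1:]
```

In a real entry point this would be called as check_args(sys.argv) after importing sys.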

santiago-es commented 10 months ago

There are only two barcodes in my barcode file. The file is in FASTA format, like so:

>bc1008
SEQUENCE HERE
>bc1009 
SEQUENCE HERE

I got the sequences from the PacBio chemistry docs.

cytham commented 10 months ago

Do you mind sending this file here?

santiago-es commented 10 months ago

Sure. GitHub doesn't support uploading .fa files, so I copied them into .txt files (barcodes.txt and capture.txt), but the contents should be identical.

cytham commented 10 months ago

Hi @santiago-es, thanks for the files. It looks like the problem lies with the last line of the barcodes.fa file which is blank.

There is an additional newline at the end of your barcodes.fa file:

>bc1008\nCGCAGCGCTCGACTGT\n>bc1009\nTCTGTCTCGCGTGTGT\n\n

I have introduced a quick fix in the latest commit, so you should be able to run with the same files now. Please clone and try again, thanks.
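
As an illustration of the failure mode, a parser that reads a header line and then calls next(f) for the sequence will raise StopIteration if a blank line is treated as a header at the end of the file. The sketch below shows one way to make a FASTA parser tolerant of blank lines. It is an assumption-laden example, not the actual fix committed to telomap: the function parse_fasta_lines is hypothetical and takes any iterable of lines, such as an open file handle.

```python
def parse_fasta_lines(lines):
    """Parse FASTA text into {name: sequence}, skipping blank lines.

    Hypothetical helper for illustration; `lines` is any iterable of strings,
    for example an open file handle.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines, including a trailing one
        if line.startswith(">"):
            name = line[1:].strip()  # header text after '>'
            records[name] = ""
        elif name is not None:
            records[name] += line  # also handles sequences wrapped over lines
    return records
```

Iterating with a plain for loop, instead of pairing each header with next(f), means a stray blank line can no longer exhaust the iterator mid-record.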

santiago-es commented 9 months ago

Hi @cytham! Thanks for those commits. I tried this again this year and telomap ran as expected, but there was trouble piping to output. I got this error:

Traceback (most recent call last):
  File "/home/artandi/data/.miniconda/bin/telomap", line 116, in <module>
    main()
  File "/home/artandi/data/.miniconda/bin/telomap", line 31, in main
    out = TeloMap(read_path, oligo_path, barcode_path, data_type, cores, tsv_header=True)
  File "/home/artandi/data/.miniconda/lib/python3.9/site-packages/telomap/integrate.py", line 25, in __init__
    read_to_cluster, self.df_anchors = self.cluster_telomeres()
  File "/home/artandi/data/.miniconda/lib/python3.9/site-packages/telomap/integrate.py", line 38, in cluster_telomeres
    clust = SubTeloClust(self.read_fasta, self.barcode_reads, self.chm13_path, self.cores)
  File "/home/artandi/data/.miniconda/lib/python3.9/site-packages/telomap/cluster.py", line 42, in __init__
    self.multi_process_cluster(barcodes)
  File "/home/artandi/data/.miniconda/lib/python3.9/site-packages/telomap/cluster.py", line 404, in multi_process_cluster
    for w in range(number_of_processes):
TypeError: 'str' object cannot be interpreted as an integer

Looks like something might be mis-typed somewhere? This was the command I used to run:

telomap bc1008.bam capture.fa barcodes.fa pacbio-bam 16 work/

I've already fixed the barcodes.fa file (thanks for pointing out the phantom \n).

cytham commented 6 months ago

@santiago-es Sorry for missing this. I'm not sure how you got that error; it may have been a typo in the older version. Could you try the latest version again? Thanks.
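
For context on the TypeError reported earlier in the thread: values read from the command line are strings, and Python's range() raises exactly this error when given a str. The sketch below shows one way a core count could be converted and validated before use. It is a hedged illustration only; parse_cores is a hypothetical helper, and nothing here is confirmed to match how telomap handles its no_cores argument.

```python
def parse_cores(value):
    """Convert a command-line core count to a positive int.

    Hypothetical helper for illustration; exits with a message on bad input.
    """
    try:
        cores = int(value)
    except ValueError:
        raise SystemExit(f"Input Error - no_cores must be an integer, got {value!r}")
    if cores < 1:
        raise SystemExit(f"Input Error - no_cores must be at least 1, got {cores}")
    return cores


# Without conversion, range("16") raises:
#   TypeError: 'str' object cannot be interpreted as an integer
workers = list(range(parse_cores("16")))
```

Converting once at the point where arguments are read keeps every downstream call, such as a loop over worker processes, working with an int.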