brian-cleary / LatentStrainAnalysis

Partitioning and analysis methods for large, complex sequence datasets
MIT License

hashing count issue #20

Open igsbma opened 7 years ago

igsbma commented 7 years ago

Hi there, hashingCount runs fine on the test data but fails on my own fastq file. It looks like a fastq format issue to me. Can you advise?

Thank you!

Traceback (most recent call last):
  File "LSA/create_hash.py", line 31, in <module>
    hashobject.rand_kmers_for_wheel(total_rand_kmers)
  File "/local/projects-t2/M8910/software/LatentStrainAnalysis/LSA/fastq_reader.py", line 140, in rand_kmers_for_wheel
    kmers_per_file = max(total_kmers/len(RP),5)
ZeroDivisionError: integer division or modulo by zero

I copied 4 lines of the fastq file here.

@SN180:316:D14VNACXX:7:1101:10000:100204/1 N:0:TTCAACC
TAATGTATAAGGATATGAAGGTTGGTGATTTGCCACAAGGCACCAAATTCAAACTTAAGCAATATAAAAAAGACGATAATCATAAAGTTATCCCATGGTCA
+
CCCFFFFFHHHHHJJJJJJJJGHJJGIIJJJJJJJJJJJJIJIJJJJIJIJJJJJJJJJJJJJJIJIIIJHHHFFCDDDDDDDDDDDDDEEDDDDDDDDDC

brian-cleary commented 7 years ago

The problem seems to be that the script is looking for fastq files and cannot find any. At least one of the following two calls is expected to return a list of fastq files:

fastq_reader.py line 133 RP = glob.glob(os.path.join(self.input_path,'*.fastq.*'))

fastq_reader.py line 139 RP = glob.glob(os.path.join(self.input_path,'*.fastq'))

Here "self.input_path" is usually "original_reads/", as in the following command:

python LSA/create_hash.py -i original_reads/ -o hashed_reads/ -k 33 -s $hashSize > Logs/CreateHash.log 2>&1

Are you sure there are fastq files in the specified input directory?
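To make the failure mode concrete, here is a minimal sketch of that lookup, not the actual LSA code. It assumes Python 3 (hence `//` where the traceback shows `/`), and the first pattern, '*.fastq.*', is an assumption about what line 133 uses. The function names `find_fastq_files` and `kmers_per_file` are hypothetical, and the explicit empty-list check is an illustrative guard that the quoted code does not have.

```python
import glob
import os


def find_fastq_files(input_path):
    """Two-step lookup: try '*.fastq.*' first, then fall back to '*.fastq'.

    Both patterns are assumptions modelled on the lines quoted in this thread.
    """
    paths = glob.glob(os.path.join(input_path, '*.fastq.*'))
    if not paths:
        paths = glob.glob(os.path.join(input_path, '*.fastq'))
    return sorted(paths)


def kmers_per_file(total_kmers, paths):
    """Split the k-mer budget across files, with a floor of 5 per file."""
    # Hypothetical guard: without it, an empty list leads to division by
    # zero, which is the ZeroDivisionError shown in the traceback.
    if not paths:
        raise ValueError("no '*.fastq' or '*.fastq.*' files found in input directory")
    return max(total_kmers // len(paths), 5)


if __name__ == "__main__":
    reads = find_fastq_files("original_reads/")
    print(len(reads), "fastq files found")
```

Neither pattern matches a file named like `sample.fq`, so a directory containing only such files yields an empty list.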


igsbma commented 7 years ago

Oh, I see. I used .fq instead of .fastq.

Thank you!
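For anyone else hitting this with .fq files, one possible workaround is to give each file a .fastq name before running create_hash.py. The sketch below is only an illustration under a few assumptions: Python 3, a filesystem that supports symlinks, and uncompressed files ending in exactly `.fq`. The helper name `link_fq_as_fastq` is hypothetical and not part of LSA. It uses symlinks rather than renaming so the original files stay untouched.

```python
import os


def link_fq_as_fastq(input_path):
    """Create NAME.fastq symlinks next to every NAME.fq file.

    Returns the list of newly created link paths. Existing .fastq
    files or links are left alone.
    """
    created = []
    for name in sorted(os.listdir(input_path)):
        stem, ext = os.path.splitext(name)
        if ext != '.fq':
            continue  # skips e.g. 'x.fq.gz', whose extension is '.gz'
        target = os.path.join(input_path, stem + '.fastq')
        if not os.path.lexists(target):
            # A relative link source is resolved relative to the link's directory.
            os.symlink(name, target)
            created.append(target)
    return created


if __name__ == "__main__":
    for path in link_fq_as_fastq("original_reads/"):
        print("linked", path)
```

Simply renaming the files to end in .fastq works just as well if you do not need to keep the old names.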