Hi,
This is probably related to issues #13 and #12, but since there are no helpful answers to those, I have opened a new one. I am running DiscoverY on what is probably a large dataset (though the genome size is ~1.2 Gb, so smaller than human), and I keep getting a Python MemoryError:
```
Started DiscoverY
Mode female+male
Using default of k=25 and input folder='data'
Shortlisting Y-contigs
Need to make Bloom Filter of k-mers from female
Done creating bloom filter
Generating a dictionary from kmers in kmers_from_male_reads
Traceback (most recent call last):
  File "discoverY.py", line 70, in <module>
    main()
  File "discoverY.py", line 65, in main
    classify_ctgs.classify_ctgs(k_size, bloom_filt, bf_capacity, female_kmers, mode)
  File "/scratch/24769731/DiscoverY/scripts/classify_ctgs.py", line 143, in classify_ctgs
    classify_fm_male_mode(kmer_size, female_kmers_bf)
  File "/scratch/24769731/DiscoverY/scripts/classify_ctgs.py", line 52, in classify_fm_male_mode
    kmer_abundance_dict_from_male = kmers.make_dict_from_kmer_abundance(reads_kmers, kmer_size)
  File "/scratch/24769731/DiscoverY/scripts/kmers.py", line 44, in make_dict_from_kmer_abundance
    kmer_dicts[line[:kmer_size]] = current_abundance
MemoryError
```
The input files end up quite large:
But I requested compute resources with 1 TB of RAM, and the usage statistics show that the job used a maximum of 600 GB.
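For context, here is a rough back-of-envelope measurement (not DiscoverY code, just an illustration I ran) of the per-entry cost of a Python dict keyed by 25-mer strings, which is what `make_dict_from_kmer_abundance` appears to build. Exact numbers depend on the Python version and platform:

```python
import random
import sys

# Build a sample dict of random 25-mers -> abundance, mimicking the
# shape of the k-mer abundance dict in kmers.py (illustrative only).
random.seed(0)
n = 100_000
kmers = {"".join(random.choices("ACGT", k=25)): 1 for _ in range(n)}

# Approximate bytes per entry: the dict's own overhead (hash table slots)
# plus each key string object. Value ints are small and cached, so they
# are ignored here.
per_entry = (sys.getsizeof(kmers) + sum(sys.getsizeof(k) for k in kmers)) / len(kmers)
print(f"~{per_entry:.0f} bytes per k-mer entry")
```

On my machine this comes out to well over 100 bytes per entry, so a few billion distinct k-mers from the male reads could plausibly need hundreds of GB for this dict alone, which might explain why the job dies even with a large allocation.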
Any help would be appreciated.