simonhmartin / genomics_general

General tools for genomic analyses.

Unable to allocate array with shape When running ABBABABAwindows.py #51

Open Yung-Chien opened 3 years ago

Yung-Chien commented 3 years ago

Hi @simonhmartin, I tried to run the script ABBABABAwindows.py to calculate D and fd, but I ran into this error:

    Traceback (most recent call last):
      File "/home1/xx/miniconda2/envs/geno/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home1/xx/miniconda2/envs/geno/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
      File "/home1/xx/software/script/github/genomics_general/ABBABABAwindows.py", line 38, in ABBABABA_wrapper
        statsDict = genomics.ABBABABA(Aln, P1, P2, P3, O, minData)
      File "/home1/xx/software/script/github/genomics_general/genomics.py", line 1597, in ABBABABA
        all4freqs = all4Aln.siteFreqs(sites=goodSites)
      File "/home1/xx/software/script/github/genomics_general/genomics.py", line 1032, in siteFreqs
        return np.array([binBaseFreqs(self.numArray[:,x][self.nanMask[:,x]], asCounts=asCounts) for x in sites])
      File "/home1/xx/software/script/github/genomics_general/genomics.py", line 1032, in <listcomp>
        return np.array([binBaseFreqs(self.numArray[:,x][self.nanMask[:,x]], asCounts=asCounts) for x in sites])
      File "/home1/xx/software/script/github/genomics_general/genomics.py", line 593, in binBaseFreqs
        else: return 1. * np.bincount(numArr, minlength=4) / n
      File "<__array_function__ internals>", line 6, in bincount
    MemoryError: Unable to allocate array with shape (140160792457881,) and data type int64

It seems that a memory error occurred in the Python 3 environment. I then switched to a Python 2 environment and reran the script, but got this error instead:

      File "/home1/xx/software/script/github/genomics_general/ABBABABAwindows.py", line 64
        print("Sorter received result", resNumber, file=sys.stderr)
                                                       ^
    SyntaxError: invalid syntax

Ummmm, how can I get it to work? Thanks in advance.

Best regards, Yung-Chien

simonhmartin commented 3 years ago

Hi, I'm not sure what the problem is. Would you be able to share a portion of your .geno file so that I can try to recreate the error locally and figure out the cause? Thanks

simonhmartin commented 3 years ago

Hi, are you still encountering this problem? I recently saw another case with the same error. The cause was the inclusion of the * character in the .geno file. The only solution at the moment is to replace those characters with N.
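For anyone hitting the same error, the replacement could be sketched roughly as below. This is only a minimal illustration, not part of genomics_general: the function names `replace_star_alleles` and `clean_geno` are hypothetical, and it assumes a tab-delimited .geno file whose header line starts with `#` and whose first two columns hold the scaffold and position, so only the genotype columns are touched.

```python
import io


def replace_star_alleles(line, sep="\t", n_fixed_cols=2):
    """Return `line` with '*' replaced by 'N' in the genotype columns only."""
    # Assumption: header lines start with '#'; pass them through unchanged.
    if line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split(sep)
    # Assumption: the first `n_fixed_cols` columns are scaffold and position.
    fixed, genotypes = fields[:n_fixed_cols], fields[n_fixed_cols:]
    genotypes = [g.replace("*", "N") for g in genotypes]
    return sep.join(fixed + genotypes) + newline


def clean_geno(src, dst):
    """Copy a .geno stream from `src` to `dst`, masking '*' alleles as 'N'."""
    for line in src:
        dst.write(replace_star_alleles(line))


# Small in-memory demonstration with made-up data.
example = "#CHROM\tPOS\tind1\tind2\nchr1\t101\tA/*\tT/T\nchr1\t102\t*/*\tG/C\n"
out = io.StringIO()
clean_geno(io.StringIO(example), out)
cleaned = out.getvalue()
```

For a real file, `src` and `dst` would be open file handles; for a gzipped .geno file, `gzip.open(path, "rt")` and `gzip.open(path, "wt")` from the standard library should work in the same way.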