dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

TypeError while using ipyrad.analysis.popgen #560

Closed spilornis-ng closed 2 weeks ago

spilornis-ng commented 2 weeks ago

Hi!

I am getting the following error while running the popgen summary stats tool. Requesting help with this!

Thanks! Naman

Parallel connection | node01: 40 cores
[locus filter] full data: 85789
[locus filter] post filter: 85768
[####################] 100% 0:00:05 | Calculating sumstats for nloci 85768 

Encountered an Error.
Message: TypeError: object of type 'numpy.float64' has no len()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File <string>:1

File ~/miniconda3/envs/ipyrad/lib/python3.10/site-packages/ipyrad/analysis/popgen.py:410, in _calc_sumstats(params, start_lidx, loci)
    407 def _calc_sumstats(params, start_lidx, loci):
    408     # process chunk writes to files and returns proc with features.
    409     proc = Processor(params, start_lidx, loci)
--> 410     proc.run()
    412     with open(proc.outfile, 'wb') as outpickle:
    413         pickle.dump(proc.results, outpickle)

File ~/miniconda3/envs/ipyrad/lib/python3.10/site-packages/ipyrad/analysis/popgen.py:455, in Processor.run(self)
    449 for pop in self.imap:
    450     # Carve off just the samples for this population
    451     try:
    452         # The locus may not have data for all samples in the population
    453         # so `intersection` retains the sample names common to the locus
    454         # index and the samples in the imap pop
--> 455         cts, sidxs, length = self._process_locus(
    456                                     locus.loc[locus.index.intersection(self.imap[pop])])
    457     except KeyError:
    458         raise Exception("Error in Processor.run() lidx: {}".format(lidx))

File ~/miniconda3/envs/ipyrad/lib/python3.10/site-packages/ipyrad/analysis/popgen.py:517, in Processor._process_locus(self, locus)
    514 cts = np.array(locus.apply(lambda bases:\
    515                 Counter(x for x in bases if x not in [45, 78])))
    516 # Only consider variable sites
--> 517 snps = np.array([len(x) for x in cts]) > 1
    518 # Indexes of variable sites
    519 sidxs = np.where(snps)[0]

File ~/miniconda3/envs/ipyrad/lib/python3.10/site-packages/ipyrad/analysis/popgen.py:517, in <listcomp>(.0)
    514 cts = np.array(locus.apply(lambda bases:\
    515                 Counter(x for x in bases if x not in [45, 78])))
    516 # Only consider variable sites
--> 517 snps = np.array([len(x) for x in cts]) > 1
    518 # Indexes of variable sites
    519 sidxs = np.where(snps)[0]

TypeError: object of type 'numpy.float64' has no len()
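The final frame shows `len(x)` being called on a `numpy.float64`, so at least one entry of `cts` was a float rather than a `Counter`. The traceback does not show why. One plausible cause (an assumption, not confirmed in this thread) is that a population has no samples at some locus, leaving missing values (NaN) where per-site counts were expected. Below is a minimal stdlib-only sketch of the failure mode and a defensive guard; `count_variable_sites` is a hypothetical helper, not part of ipyrad:

```python
from collections import Counter


def count_variable_sites(cts):
    """Return a list of booleans: True where a site has more than one allele.

    Hypothetical helper, not ipyrad code. Entries that are not Counter
    objects (for example a float NaN) are treated as non-variable
    instead of being passed to len().
    """
    return [isinstance(x, Counter) and len(x) > 1 for x in cts]


# Two well-formed sites plus one float placeholder, mimicking the bad entry.
cts = [Counter([65, 65, 71]), Counter([67, 67]), float("nan")]

# The unguarded pattern from the traceback fails on the float entry.
try:
    snps = [len(x) > 1 for x in cts]
except TypeError as err:
    print("unguarded:", err)

# The guarded version skips the float entry instead of raising.
print("guarded:", count_variable_sites(cts))
```

Because a `numpy.float64` is likewise not a `Counter`, the same check would apply to the array in the traceback. Whether skipping such entries is the right fix for ipyrad is not something this thread establishes.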

isaacovercast commented 2 weeks ago

Hi Naman,

I consider the popgen analysis tool to be 'early alpha'. It is known to be broken and has never worked well, so I would recommend not using it. I would use vcftools instead, which can get you most of the popgen stats you might want pretty easily. Hope that helps.

-isaac
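
For readers following this suggestion, here is a hedged sketch of how the vcftools calls might be assembled from Python. It assumes the ipyrad assembly has been written out as a VCF. The file names `assembly.vcf`, `popA.txt`, and `popB.txt` and the helper `vcftools_cmd` are hypothetical. The options used (`--vcf`, `--out`, `--site-pi`, `--TajimaD`, `--het`, `--weir-fst-pop`) are standard vcftools flags, and the window size is only illustrative:

```python
import os
import shutil
import subprocess


def vcftools_cmd(vcf, out_prefix, *stat_args):
    """Build a vcftools argument list (hypothetical helper)."""
    return ["vcftools", "--vcf", vcf, "--out", out_prefix, *stat_args]


# Hypothetical input and population files; substitute your own.
VCF = "assembly.vcf"
commands = [
    # Per-site nucleotide diversity.
    vcftools_cmd(VCF, "pi", "--site-pi"),
    # Tajima's D in windows of 10000 bp (illustrative window size).
    vcftools_cmd(VCF, "tajd", "--TajimaD", "10000"),
    # Per-individual heterozygosity and inbreeding coefficient.
    vcftools_cmd(VCF, "het", "--het"),
    # Weir and Cockerham Fst between two populations listed in text files.
    vcftools_cmd(VCF, "fst", "--weir-fst-pop", "popA.txt",
                 "--weir-fst-pop", "popB.txt"),
]

if __name__ == "__main__":
    # Only execute when vcftools and the input file are actually present;
    # otherwise just show the command lines that would be run.
    can_run = shutil.which("vcftools") is not None and os.path.exists(VCF)
    for cmd in commands:
        if can_run:
            subprocess.run(cmd, check=True)
        else:
            print("would run:", " ".join(cmd))
```

Each population file passed to `--weir-fst-pop` is expected to list one sample name per line, matching the sample names in the VCF.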

spilornis-ng commented 2 weeks ago

Hi Isaac, thanks for the suggestion. I will proceed with vcftools and other software for popgen stats.