matted-zz / multipool

High-resolution genetic mapping for pooled sequencing.
http://cgs.csail.mit.edu/multipool/
MIT License

ValueError: Unable to create correctly shaped tuple from inf #9

Open jayoung opened 7 years ago

jayoung commented 7 years ago

Hi there,

Thanks very much for multipool - I think it'll be really useful for us.

We have a yeast dataset we're trying to run it on. It runs fine on most chromosomes, but with three chromosomes, we get an error - I've pasted the full output below, but the last line is this: "ValueError: Unable to create correctly shaped tuple from inf". I'm guessing there's something odd about the data we have for this chromosome that's breaking multipool. Doing some plots of our own doesn't show anything very obvious yet, but we're in the early days of exploring this dataset.

I know I'm using newer versions of the dependencies than you list on the wiki page. At first I thought maybe that was the problem, so I have also tried setting up a virtual environment where I install the same versions listed on the multipool wiki page ("python 2.7.3, scipy 0.9.0, numpy 1.6.1, and matplotlib 1.1.1rc are used."). Is matplotlib 1.1.1rc the same as matplotlib 1.1.1? I'm not clear on where to get the rc version, if that's different. Anyway, using that virtual environment, I haven't managed to get mp_inference.py to run even on your test files. Again, I've pasted the full error below, but the last couple of lines look like this:
File "/home/jayoung/malik_lab_shared/linux_gizmo/bin/mp_inference.py", line 66, in load_table
    variances = numpy.full(len(bin_starts), numpy.inf)
AttributeError: 'module' object has no attribute 'full'

When I google that, it seems to suggest that the numpy.full function isn't in v1.6.1 of numpy - can you clarify the package versions you're using?

I'm happy to share the data files that seem to generate this error, if you're able to take a look at them? Hope this isn't too much of a pain to deal with.

Thanks very much,

Janet Young


Dr. Janet Young

Malik lab http://research.fhcrc.org/malik/en.html

Division of Basic Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Avenue N., A2-025, P.O. Box 19024, Seattle, WA 98109-1024, USA.

tel: (206) 667 4512 email: jayoung ...at... fhcrc.org


error with our default package versions:

Multipool version: 0.10.2
Python version: 2.7.6 (default, Jun 22 2015, 17:58:13) [GCC 4.8.2]
Scipy version: 0.13.3
Numpy version: 1.8.2
Matplotlib version: 1.3.1
Recombination fraction: 0.00030303030303 in cM: 3300.0
cutoff: 497.0
Filtering allele counts: [ 359. 189.]
Filtering allele counts: [ 371. 157.]
Filtering allele counts: [ 260. 238.]
Filtering allele counts: [ 365. 340.]
Filtering allele counts: [ 293. 269.]
Filtering allele counts: [ 243. 302.]
Filtering allele counts: [ 400. 268.]
Filtering allele counts: [ 443. 252.]
Filtering allele counts: [ 564. 269.]
Filtering allele counts: [ 290. 231.]
Filtering allele counts: [ 220. 310.]
Filtering allele counts: [ 327. 195.]
Filtering allele counts: [ 320. 241.]
Filtering allele counts: [ 584. 310.]
Filtering allele counts: [ 261. 246.]
Filtering allele counts: [ 211. 344.]
Filtering allele counts: [ 268. 292.]
Filtering allele counts: [ 243. 274.]
Filtering allele counts: [ 271. 365.]
Filtering allele counts: [ 384. 457.]
cutoff: 509.0
Filtering allele counts: [ 274. 385.]
Filtering allele counts: [ 217. 301.]
Filtering allele counts: [ 294. 356.]
Filtering allele counts: [ 309. 363.]
Filtering allele counts: [ 228. 282.]
Filtering allele counts: [ 316. 368.]
Filtering allele counts: [ 277. 376.]
Filtering allele counts: [ 263. 572.]
Filtering allele counts: [ 111. 420.]

Traceback (most recent call last):
  File "/home/jayoung/malik_lab_shared/linux_gizmo/bin/mp_inference.py", line 459, in
    y, y_var, y2, y_var2, d, d2, T, bins = doLoading(args.fins, args.filter)
  File "/home/jayoung/malik_lab_shared/linux_gizmo/bin/mp_inference.py", line 232, in doLoading
    y_var = numpy.pad(y_var, pad_widths, 'constant', constant_values=numpy.inf)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/arraypad.py", line 1320, in pad
    kwargs[i] = _normalize_shape(narray, kwargs[i])
  File "/usr/lib/python2.7/dist-packages/numpy/lib/arraypad.py", line 1044, in _normalize_shape
    raise ValueError(fmt % (shape,))
ValueError: Unable to create correctly shaped tuple from inf
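
Note for readers hitting the same ValueError: the traceback above shows numpy.pad in NumPy 1.8.2 rejecting the scalar numpy.inf passed as constant_values. One possible workaround is to build the padded array with numpy.concatenate instead of numpy.pad. The following is only a sketch, not multipool's actual fix. The helper name pad_with_inf is hypothetical, and it assumes y_var is 1-D and is padded by a (before, after) pair of counts.

```python
import numpy


def pad_with_inf(values, before, after):
    """Pad a 1-D array with +inf on each side without calling numpy.pad.

    Hypothetical helper: assumes `values` is 1-D and that `before` and
    `after` are non-negative integer pad counts.
    """
    values = numpy.asarray(values, dtype=float)
    # numpy.ones(n) * inf avoids numpy.full, which older NumPy lacks.
    left = numpy.ones(before) * numpy.inf
    right = numpy.ones(after) * numpy.inf
    return numpy.concatenate([left, values, right])


# Illustrative values only, not taken from the reported dataset.
y_var = numpy.array([0.5, 0.25, 0.125])
padded = pad_with_inf(y_var, 2, 1)
print(padded)
```

Because it relies only on numpy.ones and numpy.concatenate, this approach should behave the same way across the NumPy versions mentioned in this thread.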

error when I try to use the virtual env:

Multipool version: 0.10.2
Python version: 2.7.3 (default, Jan 21 2016, 14:53:14) [GCC 4.8.4]
Scipy version: 0.9.0
Numpy version: 1.6.1
Matplotlib version: 1.1.1
Recombination fraction: 0.00030303030303 in cM: 3300.0
cutoff: 300.0
Filtering allele counts: [ 186. 128.]
Filtering allele counts: [ 200. 138.]
Filtering allele counts: [ 198. 187.]
Filtering allele counts: [ 87. 248.]
Filtering allele counts: [ 161. 163.]
Filtering allele counts: [ 140. 192.]
Filtering allele counts: [ 273. 55.]
Filtering allele counts: [ 1083. 2086.]
Filtering allele counts: [ 189. 165.]
Filtering allele counts: [ 271. 92.]
Filtering allele counts: [ 60. 276.]
Filtering allele counts: [ 260. 207.]
Filtering allele counts: [ 122. 184.]
Filtering allele counts: [ 276. 59.]
Filtering allele counts: [ 125. 177.]
Filtering allele counts: [ 108. 240.]
Filtering allele counts: [ 217. 148.]
Filtering allele counts: [ 244. 147.]
Filtering allele counts: [ 169. 144.]
Filtering allele counts: [ 160. 142.]
Filtering allele counts: [ 96. 221.]
Filtering allele counts: [ 240. 218.]
Filtering allele counts: [ 204. 122.]
Filtering allele counts: [ 179. 172.]
Filtering allele counts: [ 267. 89.]
Filtering allele counts: [ 191. 161.]
Filtering allele counts: [ 149. 169.]
Filtering allele counts: [ 235. 112.]
Filtering allele counts: [ 152. 169.]
Filtering allele counts: [ 153. 157.]
Filtering allele counts: [ 213. 129.]
Filtering allele counts: [ 227. 118.]
Filtering allele counts: [ 192. 250.]
Filtering allele counts: [ 154. 165.]
Filtering allele counts: [ 189. 236.]

Traceback (most recent call last):
  File "/home/jayoung/malik_lab_shared/linux_gizmo/bin/mp_inference.py", line 459, in
    y, y_var, y2, y_var2, d, d2, T, bins = doLoading(args.fins, args.filter)
  File "/home/jayoung/malik_lab_shared/linux_gizmo/bin/mp_inference.py", line 201, in doLoading
    y,y_var,d, bins = load_table(fins[0], res, False, filt)
  File "/home/jayoung/malik_lab_shared/linux_gizmo/bin/mp_inference.py", line 66, in load_table
    variances = numpy.full(len(bin_starts), numpy.inf)
AttributeError: 'module' object has no attribute 'full'
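
Note on the AttributeError: numpy.full was added in NumPy 1.8.0, so it is not available in the 1.6.1 environment above. A version-tolerant way to build the same inf-filled array might look like the sketch below. The helper name full_compat is hypothetical and is not part of multipool or NumPy.

```python
import numpy


def full_compat(length, fill_value):
    """Return a 1-D float array of `length` entries set to `fill_value`.

    Hypothetical helper: uses numpy.full when the installed NumPy has it
    and falls back to empty() + fill() on older releases.
    """
    if hasattr(numpy, "full"):
        return numpy.full(length, fill_value, dtype=float)
    out = numpy.empty(length, dtype=float)
    out.fill(fill_value)
    return out


# Mirrors the shape of the failing call; 4 stands in for len(bin_starts).
variances = full_compat(4, numpy.inf)
print(numpy.__version__, variances)
```

Printing numpy.__version__ alongside the result is a quick way to confirm which NumPy the virtual environment is actually importing.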

matted-zz commented 7 years ago

Thanks! This is puzzling; I'll have to look into it more closely. As a possible workaround, would you mind trying one of the previous releases (accessible here or via git clone --branch v0.10.2 https://github.com/matted/multipool.git)? From a quick glance it seems like the problem is arising in new code added after the 0.10.2 release.

jayoung commented 7 years ago

Thanks for the quick reply, Matt. Yes, the v0.10.2 release version runs fine on our three problem chromosomes (and I tested v0.10.1 on one chromosome - that looks good too).

We'll go with the release version - that's great. If you'd like a problem dataset for further testing, let me know.