msimet / Stile

Stile: the Systematics Tests In Lensing pipeline
BSD 3-Clause "New" or "Revised" License

I/O timing issues #38

Open rmandelb opened 9 years ago

rmandelb commented 9 years ago

I was just noticing as I looked at the TreeCorr Readme file that it uses a package called pandas instead of numpy for fast text file i/o:

This package significantly speeds up the reading of ASCII input catalogs over the numpy functions loadtxt or genfromtxt.

I see that we use genfromtxt. Do you have any sense for how much the file I/O is hitting us for typical uses of Stile? If it's a problem we might consider adding this dependency.
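
One way to get a sense of the cost is to time the readers directly on a synthetic catalog. The following is only a rough sketch and not Stile code: the file layout (whitespace-delimited floats), the catalog size, and the helper names `make_catalog` and `best_time` are assumptions made for illustration. pandas is timed only if it happens to be installed.

```python
import os
import tempfile
import time

import numpy as np


def make_catalog(path, n_rows=20000, n_cols=5, seed=0):
    """Write a whitespace-delimited ASCII catalog of random floats.

    The layout and size are hypothetical stand-ins for a real catalog.
    """
    rng = np.random.RandomState(seed)
    np.savetxt(path, rng.uniform(size=(n_rows, n_cols)))


def best_time(func, repeats=3):
    """Return the fastest wall-clock time (seconds) over `repeats` calls."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return min(times)


tmp_dir = tempfile.mkdtemp()
catalog_path = os.path.join(tmp_dir, "catalog.dat")
make_catalog(catalog_path)

# Always time the reader currently in use.
timings = {"genfromtxt": best_time(lambda: np.genfromtxt(catalog_path))}

try:
    import pandas  # optional: only timed when available

    timings["pandas.read_csv"] = best_time(
        lambda: pandas.read_csv(catalog_path, sep=r"\s+", header=None).values
    )
except ImportError:
    pass

for name, seconds in sorted(timings.items()):
    print("%-16s %.3f s" % (name, seconds))
```

The numbers will depend heavily on the catalog size and column types, so a real test should use a file representative of typical Stile inputs.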

msimet commented 9 years ago

I don't know, but we can look into it. I don't think it would be hard to make this an optional dependency either... (Email reply to Rachel Mandelbaum's comment of Sep 17, 2014, 12:29 PM.)

rmandelb commented 9 years ago

True. Try/except statements are our friends.
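
A minimal sketch of the try/except approach, assuming a plain whitespace-delimited numeric catalog with `#` comment lines. The function name `read_ascii_table` and the flag `HAS_PANDAS` are hypothetical and do not come from Stile or TreeCorr.

```python
import numpy as np

try:
    import pandas
    HAS_PANDAS = True
except ImportError:
    # pandas is optional; fall back to numpy if it is missing.
    HAS_PANDAS = False


def read_ascii_table(file_name, comments="#"):
    """Read a whitespace-delimited numeric table into a numpy array.

    Uses pandas.read_csv when pandas is installed and numpy.genfromtxt
    otherwise. (Hypothetical helper, for illustration only.)
    """
    if HAS_PANDAS:
        frame = pandas.read_csv(
            file_name, sep=r"\s+", comment=comments, header=None
        )
        return frame.values
    return np.genfromtxt(file_name, comments=comments)
```

The two branches are not guaranteed to behave identically in edge cases. For example, genfromtxt returns a 1-D array for a single-row or single-column file while the pandas branch always returns a 2-D array, and pandas infers column types rather than defaulting to float, so any real implementation would need tests on both paths.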

rmandelb commented 9 years ago

Looking at TreeCorr usage of pandas, it looks slightly more complex than genfromtxt:

https://github.com/rmjarvis/TreeCorr/blob/releases/3.1/treecorr/catalog.py

But not too bad. Now that I managed to install TreeCorr I can do it.

msimet commented 9 years ago

Great, thanks!