chrisclark / PythonForDataScience

File parsing improvements. #1

Closed lexual closed 11 years ago

lexual commented 11 years ago

Hi,

I really enjoyed your tutorial.

Here's a patch that uses pandas for file parsing. You might like to update the tutorial with these improvements?

It should parse the CSV files 2-3 times faster than numpy does. I also think it makes the code more readable.
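
To illustrate the readability point, here is a minimal sketch comparing the two approaches. It is not the code from the patch: the column names and the in-memory CSV text are made up for the example, and `np.genfromtxt` is only assumed as the numpy loader being replaced.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV contents standing in for the tutorial's data file.
csv_text = "label,f1,f2\n1,0.5,0.25\n0,0.75,0.125\n1,0.0,1.0\n"

# numpy: parse everything into one float array, skipping the header row,
# then split target and features by column position.
np_data = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
np_target = np_data[:, 0]
np_features = np_data[:, 1:]

# pandas: the header row becomes column names, so columns are picked by name.
df = pd.read_csv(io.StringIO(csv_text))
pd_target = df["label"].to_numpy()
pd_features = df.drop(columns="label").to_numpy()
```

Both versions end up with the same arrays; the pandas one just avoids positional slicing. Any speed difference will depend on the file and the library versions, so it is worth timing on the real data.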

See:

http://wesmckinney.com/blog/?p=543

http://pandas.pydata.org/

Thanks,

Lex.

chrisclark commented 11 years ago

This is great! I like the n_jobs=-1 trick as well - didn't know about that. Thanks for taking the time to submit the pull request! Merging now...
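
For anyone reading along who hasn't seen `n_jobs=-1` either: in scikit-learn it asks an estimator to use all available CPU cores. A rough sketch, assuming a `RandomForestClassifier` (the thread doesn't say which estimator the patch touches) and synthetic data in place of the tutorial's files:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the real tutorial loads its own CSV files.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# n_jobs=-1 tells scikit-learn to train the trees on all available cores.
forest = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
forest.fit(X, y)

predictions = forest.predict(X)
```

The fitted model is the same as with the default `n_jobs`; only the wall-clock time changes.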