Open v0dro opened 7 years ago
@v0dro is there a real life reason for this functionality, in the first place?
I recently saw a PR that attempted to remove the line containing `block_given?`, and the specs passed since there was no test of this sort.
Yes, I understand why we need the test, if functionality exists. But my question is WHY the functionality exists?
Say you have a CSV column that contains dates in the form DD MONTH YEAR (for example, `12 february 2016`) and you want to convert it to a `DateTime` with your own conversion logic while reading the file into a dataframe. The easiest way to do that would be to pass a block to `from_csv` that can modify the data as it comes in.
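To make the use case concrete, here is a minimal sketch that uses only Ruby's standard `csv` and `date` libraries. The helper `read_csv_with_block` is hypothetical: it stands in for a reader that hands the open `CSV` object to the caller's block before any rows are read, and it is not daru's implementation. The column names and sample data are also made up for illustration.

```ruby
require 'csv'
require 'date'

# Hypothetical stand-in for a block-accepting CSV reader (NOT daru's code).
# It yields the CSV object before any rows are read, so the caller can
# register custom converters on it.
def read_csv_with_block(text)
  csv = CSV.new(text, headers: true)
  yield csv if block_given?
  csv.read
end

# Matches fields such as "12 february 2016".
MONTH_DATE = /\A\d{1,2} [a-z]+ \d{4}\z/i

data = "name,joined\nalice,12 february 2016\nbob,3 march 2015\n"

table = read_csv_with_block(data) do |csv|
  # Convert only the fields that look like DD MONTH YEAR; leave the rest alone.
  csv.convert do |field|
    MONTH_DATE.match?(field) ? DateTime.strptime(field, '%d %B %Y') : field
  end
end

first_joined = table['joined'].first
```

With this sketch, `table['joined']` holds `DateTime` objects while `table['name']` keeps its strings, so no second pass over the data is needed after loading.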
I process a lot of CSV files and have gotten into the habit of reading all fields as strings and doing the conversions after the data is built into a dataframe.
Got it, thanks :+1:
@gnilrets for smaller dataframes and simpler usage scenarios I think passing a block is more readable and straightforward.
I want to work on this. I'm looking for something I can do for GSoC and I think it's a good fit.
@GusAndrianos yes this would be a great and simple issue to start with. Have you had a look at the source code yet? You should hurry up with your proposal since the deadline for submitting the final proposal is 4th April.
@v0dro This wasn't what I had in mind for GSoC so having to submit patches for every organization I am interested in kind of caught me off guard. I'll try to solve this quickly as this is the only thing missing from my proposal. :)
Anything you want me to know before starting?
Well this is a pretty easy patch so I don't think you will require my help for it. Make sure you submit your draft proposal early. A proposal without a patch submission is also fine since we can start evaluating it. You can always add information about the code submission later.
@v0dro That's awesome, I haven't really found anything that fits me better than SciRuby.
I am a bit confused. Can you give a usage example? I'm stuck on this for a while now.
Can we close this (based on the discussion on #413)? #428 has been filed for removal of block support.
The `DataFrame.from_csv` method currently has a provision for accepting blocks and performing some manipulation on a row that has been read before loading the data into a dataframe. However, there are no tests in `io_spec.rb` for testing this. Tests should amply test error conditions too.
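As a rough illustration of the cases such tests might cover, the sketch below checks three things against a hypothetical stand-in reader: behaviour without a block, behaviour with a block, and an error condition. It uses only Ruby's standard `csv` and `date` libraries; `read_csv_with_block`, `strict_dates`, and the sample data are assumptions for illustration, not daru's code or the contents of `io_spec.rb`.

```ruby
require 'csv'
require 'date'

# Hypothetical stand-in for a block-accepting CSV reader (NOT daru's code).
def read_csv_with_block(text)
  csv = CSV.new(text, headers: true)
  yield csv if block_given?
  csv.read
end

# Strict converter (hypothetical): fields ending in a four-digit year must
# parse as DD MONTH YEAR, otherwise strptime raises an ArgumentError
# (Date::Error, a subclass, on newer Rubies).
strict_dates = lambda do |csv|
  csv.convert do |field|
    field.match?(/\d{4}\z/) ? DateTime.strptime(field, '%d %B %Y') : field
  end
end

good = "name,joined\nalice,12 february 2016\n"
bad  = "name,joined\nbob,12 notamonth 2016\n"

# Case 1: without a block, fields stay as plain strings.
without_block = read_csv_with_block(good)

# Case 2: with a block, the conversion is applied while reading.
with_block = read_csv_with_block(good, &strict_dates)

# Case 3: error condition, a malformed date surfaces as an exception.
error = begin
  read_csv_with_block(bad, &strict_dates)
  nil
rescue ArgumentError => e
  e
end
```

In an actual spec each case would become its own example, with the third one asserting that the error is raised rather than silently swallowed.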