SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

Add tests for testing block_given? functionality of DataFrame.from_csv() method. #308

Open v0dro opened 7 years ago

v0dro commented 7 years ago

The DataFrame.from_csv method currently has a provision for accepting blocks and performing some manipulation on a row that has been read before loading the data into a dataframe.

However, there are no tests in io_spec.rb for testing this.

Tests should amply test error conditions too.

zverok commented 7 years ago

@v0dro is there a real life reason for this functionality, in the first place?

v0dro commented 7 years ago

I recently saw a PR that attempted to remove the line containing block_given? and the specs passed since there was no test of this sort.

zverok commented 7 years ago

Yes, I understand why we need the test, if the functionality exists. But my question is WHY does the functionality exist?

v0dro commented 7 years ago

Say you have a CSV column that contains dates in the form DD MONTH YEAR (for example, 12 february 2016) and you want to convert it to a DateTime with your own conversion logic while reading the file into a dataframe. The easiest way to do that would be to pass a block to from_csv that can modify the data as it comes in.
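To make the idea concrete, here is a minimal sketch that uses only Ruby's standard library. It does not call daru: `rows_to_columns` is a hypothetical helper written for illustration, and the exact object that `from_csv` yields to its block may well differ. The sample data and column names are also made up, and `Date` stands in for `DateTime` to keep the example short.

```ruby
require 'csv'
require 'date'

# Hypothetical helper (NOT daru's API): parses CSV text and, if a block is
# given, passes each row (a Hash keyed by header) through it before the
# values are collected into per-column arrays.
def rows_to_columns(csv_text)
  columns = Hash.new { |hash, key| hash[key] = [] }
  CSV.parse(csv_text, headers: true).each do |row|
    record = row.to_h
    record = yield(record) if block_given?
    record.each { |name, value| columns[name] << value }
  end
  columns
end

# Made-up sample data with dates in DD MONTH YEAR form.
csv_text = <<~DATA
  name,joined
  alice,12 february 2016
  bob,03 march 2017
DATA

# Without a block, every field stays a String.
raw = rows_to_columns(csv_text)

# With a block, the date column is converted while the rows are being read.
converted = rows_to_columns(csv_text) do |record|
  record.merge('joined' => Date.strptime(record['joined'], '%d %B %Y'))
end
```

A spec for this kind of behaviour would presumably cover both paths: the call without a block, and the call with a block, including a block that raises.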

gnilrets commented 7 years ago

I process a lot of CSV files and have gotten into the habit of reading all fields as strings and doing conversions after the data is built into a dataframe.
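A rough sketch of that two-step approach, again using only Ruby's standard library instead of daru. The sample data and the `joined` column name are assumptions made up for illustration, and a plain Hash of arrays stands in for the dataframe.

```ruby
require 'csv'
require 'date'

# Made-up sample data with dates in DD MONTH YEAR form.
csv_text = <<~DATA
  name,joined
  alice,12 february 2016
  bob,03 march 2017
DATA

# Step 1: read every field as a String (CSV applies no converters by default)
# and collect the values column by column.
table = CSV.parse(csv_text, headers: true)
columns = table.headers.to_h { |header| [header, table[header]] }

# Step 2: convert the date column once everything has been loaded.
columns['joined'] = columns['joined'].map do |text|
  Date.strptime(text, '%d %B %Y')
end
```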

zverok commented 7 years ago

Got it, thanks :+1:

v0dro commented 7 years ago

@gnilrets for smaller dataframes and simpler usage scenarios I think passing a block is more readable and straightforward.

gusandrianos commented 7 years ago

I want to work on this. I'm looking for something I can do for GSoC and I think it's a good fit.

v0dro commented 7 years ago

@GusAndrianos yes this would be a great and simple issue to start with. Have you had a look at the source code yet? You should hurry up with your proposal since the deadline for submitting the final proposal is 4th April.

gusandrianos commented 7 years ago

@v0dro This wasn't what I had in mind for GSoC so having to submit patches for every organization I am interested in kind of caught me off guard. I'll try to solve this quickly as this is the only thing missing from my proposal. :)

Anything you want me to know before starting?

v0dro commented 7 years ago

Well this is a pretty easy patch so I don't think you will require my help for it. Make sure you submit your draft proposal early. A proposal without a patch submission is also fine since we can start evaluating it. You can always add information about the code submission later.

gusandrianos commented 7 years ago

@v0dro That's awesome, I haven't really found anything that fits me better than SciRuby.

gusandrianos commented 7 years ago

I am a bit confused. Can you give a usage example? I've been stuck on this for a while now.

parthm commented 6 years ago

Can we close this (based on the discussion in #413)? #428 has been filed for removal of block support.