TidierOrg / Tidier.jl

Meta-package for data analysis in Julia, modeled after the R tidyverse.
MIT License
515 stars 14 forks source link

Implementing functions proposed in #48 #83

Closed alonsoC1s closed 1 year ago

alonsoC1s commented 1 year ago

Implements as_float, as_integer, as_string and adds tests for the added functionality as mentioned in #48 .

kdpsingh commented 1 year ago

Thank you for this! I'm planning to move the tests to docstrings and am happy to take it from here.

alonsoC1s commented 1 year ago

Excelent! I can still help getting this ready to merge by changing the test, checking why the build fails, etc.

kdpsingh commented 1 year ago

I think the build failing is an easy problem to fix - my guess is we just need to export the functions.

I'm going to move the tests to docstrings (for consistency with the rest of the package) and add a documentation page.

It should be straightforward. Since there's no official style guide yet, this is my chance to keep things consistent for now.

lazarusA commented 1 year ago

maybe try to avoid using catch and try and use something more like a ismissing conditional instead.

kdpsingh commented 1 year ago

That's a good suggestion, @lazarusA.

That approach may work for convert() (one number to another) but likely not for parse() in the way as_float() (technically as_numeric()) is implemented in R. For example, as_float("hello") should return a missing value and not an error (to mimic R behavior) whereas as_float("1.1") should return 1.1.

I also need to see how float-to-integer conversion works in R (I think it truncates).

While it's not good practice from a Julia standpoint, I suspect we will need to use try/catch or tryparse() (followed by conversation of nothing to missing). Given that complexity, I think the proposed approach is reasonable for now.

Going to check out a few edge cases in R before merging this.

alonsoC1s commented 1 year ago

I'm sorry, I'm struggling to see why we should avoid try blocks. Is there any particular reason?

kdpsingh commented 1 year ago

I think there are 2 philosophies of error handling: 1. look before you leap (anticipate errors and handle proactively) and 2. easier to ask for forgiveness (try/catch).

Both are described here: https://scls.gitbooks.io/ljthw/content/_chapters/11-ex8.html

Culturally, I think the first approach is considered more "correct" by many in the Julia community than the second (which is commonly used in R/Python).

There may potentially be other advantages to the first (in terms of compiler-friendliness/efficiency/speed) but I'm quite used to try/catch and don't see a major issue with it as you've used it here (and it may be unavoidable without anticipating lots of potential contingencies).

lazarusA commented 1 year ago

those functions already have ::Union{AbstractFloat, Missing}, so you already have only 2 options. Honestly, I don't see the need for try and catch if this is already settled from the start.

kdpsingh commented 1 year ago

I will try to remove and see if the behavior changes. My worry is that parse(Float, "hello") will generate an error instead of returning a missing value unless we wrap in try/catch.

I'm not sure if specifying the return type will fix that behavior.

Will try and see how it goes (am still new to Julia so truly not sure how it will behave).