ralhei / pyRserve

A python client for Rserve (network layer to remote R server)
Other
47 stars 13 forks source link

Enhanced NA Handling #25

Closed acksmaggart closed 3 years ago

acksmaggart commented 3 years ago

I know I'm wading into controversial water here, but there are cases where R's "NA" value is being converted into something unexpected in python. Specifically:

  1. The first change addresses issue #21. In short, R vectors that contained character values and NA were causing UnicodeDecodeErrors as they were being parsed into numpy arrays. The change in pyRserve/misc.py now has the parser checking for R's NA_character_ specifically and converting it to Python's None.
import pyRserve

conn = pyRserve.connect()

# Old behaivor
conn.eval("c('a', NA, 'b')") # => UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

# New behaivor
conn.eval("c('a', NA, 'b')") # => array(['a', None, 'b'], dtype=object)
  1. Converting an R vector that contains booleans and NA produces a python numpy array of integers and None. The None is expected, but the integers are a surprise. The change in pyRserve/rparser.py produces a numpy array of booleans and None.
import pyRserve

conn = pyRserve.connect()

# Old behaivor
conn.eval("c(TRUE, NA, FALSE)") # => array([1, None, 0], dtype=object)

# New behaivor
conn.eval("c(TRUE, NA, FALSE)") # => array([True, None, False], dtype=object)
acksmaggart commented 3 years ago

As soon as I can get the tests to run I'll mark is as "Ready for review". Rserve on my machine must have been installed differently than on yours because the testtools.py module isn't detecting the path correctly.

ralhei commented 3 years ago

Hi Max, did you get any progress on this issue? R

acksmaggart commented 3 years ago

Hey, @ralhei . Sorry for the wait. I have made the two changes you suggested and re-based onto master, let me know if anything still looks squirrelly. And thanks for the package!

ralhei commented 3 years ago

Hi Max, thanks for your changes. However it is never a good idea to merge changes from master into a working tree because the history tree really gets messed up. So please (in the future) always 'rebase' your working tree onto master and avoid a merge. I'll accept your change anyway, just to get ahead with everything. Cheers, Ralph

ralhei commented 3 years ago

Sorry, changed my mind, because changes from Chad and myself would be part of the pull request.

Before running the next command please create a patch for your changes, just to be safe ;-)

Then the command to get rid of the merge-commit in your history is to run git rebase --onto 62c4d36e^ 62c4d36e HEAD Finally push (possibly with --force) again. We hopefully then have a clean pull request.

ralhei commented 3 years ago

Hi Max, in order to get this PR finally done I've finally merge your code myself in the proposed way. Due to that I will now close the PR. Thanks a lot for your contribution.