trevorld / r-argparse

command-line optional and positional argument parser
GNU General Public License v2.0

integer 'type' converted to doubles (due to relying on JSON serialization to pass data from python to R) #21

Closed DominikMueller64 closed 6 years ago

DominikMueller64 commented 6 years ago

I noticed that coercion to type integer does not seem to work properly.

p <- argparse::ArgumentParser()
p$add_argument('--int', type='integer')
p$add_argument('--double', type='double')
p$add_argument('--character', type='character')

input <- '1'
args <- p$parse_args(c('--int', input,
                       '--double', input,
                       '--character', input))
sapply(args, typeof)

gives

  character      double         int 
"character"    "double"    "double" 

whereas the type of int should be integer.

trevorld commented 6 years ago

Thanks for the bug report. I confirm that type='integer' is currently being cast to double.

This is because I currently use JSON to serialize the parsed arguments from Python's argparse module in order to import them into R. It seems that JSON doesn't distinguish integers from floats, so during the exchange all integers and doubles are cast to doubles. YAML seems to distinguish between integers and doubles, but YAML isn't supported by Python's standard library, and I don't want to complicate the package's external dependencies further by requiring a non-standard-library Python package. I'll try to find another serialization protocol that is included in Python's standard library and is also easily parsed by an R package.
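To make the hand-off described above concrete, here is a minimal sketch of the Python side only. It assumes the parsed namespace is dumped with the standard-library json module; the helper name parse_to_json is hypothetical and this is not the script the package actually runs. It shows what the JSON text looks like before any R package reads it.

```python
import argparse
import json


def parse_to_json(argv):
    """Parse argv with argparse and return the namespace as JSON text.

    Hypothetical helper for illustration; not taken from r-argparse.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('--int', type=int)
    parser.add_argument('--double', type=float)
    parser.add_argument('--character', type=str)
    args = parser.parse_args(argv)
    # vars() turns the Namespace into a plain dict that json can serialize.
    return json.dumps(vars(args), sort_keys=True)


payload = parse_to_json(['--int', '1', '--double', '1', '--character', '1'])
print(payload)
# prints {"character": "1", "double": 1.0, "int": 1}
```

In this sketch the integer is written as 1 and the float as 1.0, so whether the distinction survives depends on how the reading side maps JSON numbers to its own types.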

DominikMueller64 commented 6 years ago

I think using JSON is ok, but I suspect the problem is that rjson does not properly convert integers. Here is a small example:

In Python:

import json
data = {'int': 1, 'float': 1.1, 'string': 'foo', 'bool': True}
with open('test.json', mode='w', encoding='utf-8') as f:
    json.dump(data, f)

with open('test.json', mode='r', encoding='utf-8') as f:
    data = json.load(f)

for k, v in data.items():
    print('{}: {!s}'.format(k, type(v)))

gives

int: <class 'int'>
float: <class 'float'>
string: <class 'str'>
bool: <class 'bool'>

Reading the file in R with rjson

sapply(rjson::fromJSON(file = 'test.json'), typeof)

yields

        int       float      string        bool 
   "double"    "double" "character"   "logical" 

But if we do it with jsonlite

sapply(jsonlite::fromJSON(txt = 'test.json'), typeof)

it seems to work

        int       float      string        bool 
  "integer"    "double" "character"   "logical" 

Maybe you can just replace rjson with jsonlite without much work.
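The difference between the two readers can be mimicked in Python as a rough analogy. The sketch below says nothing about how rjson or jsonlite are implemented. It only uses the parse_int hook of json.loads to imitate a reader that maps every JSON number to a double, next to the default reader that keeps integer literals as integers.

```python
import json

text = '{"int": 1, "float": 1.1, "string": "foo", "bool": true}'

# Default reader: an integer literal such as 1 stays an int.
typed = json.loads(text)

# Reader that sends every integer literal through float(), which is
# analogous to the all-double result observed with rjson above.
all_double = json.loads(text, parse_int=float)


def type_names(data):
    """Return a dict mapping each key to the type name of its value."""
    return {key: type(value).__name__ for key, value in data.items()}


print(type_names(typed))
# prints {'int': 'int', 'float': 'float', 'string': 'str', 'bool': 'bool'}
print(type_names(all_double))
# prints {'int': 'float', 'float': 'float', 'string': 'str', 'bool': 'bool'}
```

The JSON text is identical in both calls, so the loss of the integer type in this sketch comes from the reader's number mapping and not from the serialized data.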

trevorld commented 6 years ago

Should be fixed now in the development version of argparse.

Thanks for the suggestion to use jsonlite instead of rjson. I had to update the code that supports the version action, but otherwise it was a pretty straightforward substitution of packages.