Closed engti closed 6 years ago
After some experimenting, it seems to work as expected when I set the Python path manually:
## load library
Sys.setenv(RETICULATE_PYTHON = "C:/Users/uname/.julia/v0.6/Conda/deps/usr/python.exe")
library(reticulate)
I am not sure why I have multiple Python paths when I only installed Anaconda once and this laptop was imaged only recently. I also don't know why one of them lives inside the Julia directory.
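To confirm which interpreter and package versions are actually in use, a small check like the one below can be run both in the notebook and through reticulate, and the two outputs compared. This is only a sketch: the function name describe_environment and the list of packages to inspect are my own choices for illustration, not something taken from the original setup.

```python
import sys
from importlib import metadata


def describe_environment(packages=("pandas", "numpy")):
    """Collect the interpreter path, Python version, and package versions.

    The package list is an assumption; adjust it to whatever the script imports.
    """
    info = {
        "executable": sys.executable,
        "python_version": sys.version.split()[0],
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this interpreter.
            info[name] = None
    return info


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

If the executable or the pandas version differs between the notebook and the R session, the two runs are not really executing "the same code", which would be consistent with the workaround of setting RETICULATE_PYTHON above.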
Cross-posted from here.
Issue: When using the pandas backfill function, the output is correct in a Python notebook but incorrect when the same code is called from R using the reticulate package.
Context: I am trying to use a backfill function to do a last observation carried backwards. I used tidyr's fill function with .direction = "up", which works, but on my dataset it took more than an hour.
Sample Data: Minimal sample data is located here.
Jupyter Notebook Code: I used the following code in Python, which reads the file, groups it by user_id, and then sorts by date and hour before applying the backfill function to the sale_index column:
I called the above code with the following:
This gave me the correct output in 5 minutes, which is awesome. Image of output here.
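For cross-checking results between environments, the intended behaviour (sort by user, date, and hour, then carry the next observation backwards within each user) can be written as a dependency-free reference. This is a sketch in plain Python, not the original backfill.py and not the pandas implementation; the function name backfill_by_user and the sample rows are made up for illustration, while the column names user_id, date, hour, and sale_index come from the description above.

```python
from itertools import groupby
from operator import itemgetter


def backfill_by_user(rows, key="user_id", order=("date", "hour"), column="sale_index"):
    """Return copies of rows sorted by (key, *order), with None values in
    `column` replaced by the next non-missing value within the same group."""
    ordered = sorted(rows, key=itemgetter(key, *order))
    result = []
    for _, group in groupby(ordered, key=itemgetter(key)):
        group = [dict(row) for row in group]  # copy so the input is untouched
        next_value = None
        # Walk each group from last to first, remembering the latest value seen.
        for row in reversed(group):
            if row[column] is None:
                row[column] = next_value
            else:
                next_value = row[column]
        result.extend(group)
    return result


# Hypothetical sample rows, deliberately out of order.
rows = [
    {"user_id": 2, "date": "2018-01-01", "hour": 1, "sale_index": 7},
    {"user_id": 1, "date": "2018-01-01", "hour": 2, "sale_index": 5},
    {"user_id": 1, "date": "2018-01-01", "hour": 1, "sale_index": None},
    {"user_id": 2, "date": "2018-01-01", "hour": 2, "sale_index": None},
    {"user_id": 1, "date": "2018-01-02", "hour": 0, "sale_index": None},
]

filled = backfill_by_user(rows)
for row in filled:
    print(row)
```

Note that the last row of user 1 stays missing rather than picking up the value 7 from user 2. A backfill that ignores the grouping would fill it with 7, which is the kind of symptom described below.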
R Integration: So far so good, but I am trying to integrate this into a single workflow within R, since the plotting and the Markdown creation happen there. I used the reticulate package to call the same function, which was saved as backfill.py. Here there is an issue: it did not give me the correct output, unlike when I called it from IPython.
It gave me the following output: image
Any idea what's going on? It's the same code, line for line. For some reason it seems to ignore the grouping and sorting, and to perform the backfill incorrectly according to some logic I don't understand.
Any help would be most appreciated. Thanks.
Config Info: My py_config() results are below: