Variables inside data transformation pipelines: only for certain dataframes?

seb-mueller commented 6 years ago

Having played with various completion managers (deoplete etc.) for R, ncm + Nvim-R + ncm-R appears the most promising. Variable expansion inside data transformation pipelines in particular is an awesome feature. However, it does not always seem to work for me.

As shown in the example, loading the flights dataset and placing the cursor on POS1 (see below) expands the variables. So far so good. Next I tried the same with the standard iris data frame, but on POS2 nothing happens. At first I thought it had to do with the global environment or the data frame being a tibble, but converting iris into a tibble and assigning it to a new object (which puts it in the global env) should then expand on POS3, which it doesn't. Any idea why this is? Thanks (also for this otherwise nice plugin!)

library(tidyverse)
library(nycflights13)

data(flights)

flights %>%
  select(POS1

iris %>%
  select(POS2

iristbl <- as.tbl(iris)

iristbl %>%
  select(POS3

gaalcaras commented 6 years ago

Hi there, thanks for the feedback and sorry for the late response.

It actually works for me every time:

[Screenshot: completion works]

Since I can't reproduce your issue, you're welcome to try and debug this for yourself (see this section of the README).

I don't think it has to do with iris being a tibble. You should try it with another tibble to see if it also fails or if it can also happen with a data.frame.

If it's of any help to you, this is the log message you should get when your cursor is on POS2:

[ncm-R] word: "", func: "select", pkg: None, pipe: iris

If you get something else, that means the problem lies with ncm-R code parser.
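To compare a log line against the expected one without eyeballing it, a small parser can help. The following is only a sketch: the regular expression is inferred from the single log line quoted above, so the field layout is an assumption, and `parse_ncmr_line` and `parser_saw_piped_call` are hypothetical helper names, not part of ncm-R.

```python
import re

# Pattern inferred from the one log line quoted in this thread; treat the
# field layout as an assumption, not a stable ncm-R format.
NCMR_LINE = re.compile(
    r'\[ncm-R\] word: "(?P<word>[^"]*)", func: "(?P<func>[^"]*)", '
    r'pkg: (?P<pkg>[^,]*), pipe: (?P<pipe>.*)$'
)


def parse_ncmr_line(line):
    """Return the fields of an ncm-R log line as a dict, or None if no match."""
    match = NCMR_LINE.search(line)
    if match is None:
        return None
    fields = {key: value.strip() for key, value in match.groupdict().items()}
    # Normalise empty strings and the literal "None" to Python's None.
    return {key: (None if value in ("", "None") else value)
            for key, value in fields.items()}


def parser_saw_piped_call(line, func, pipe):
    """True if the log line reports the given function and pipe source."""
    fields = parse_ncmr_line(line)
    return (fields is not None
            and fields["func"] == func
            and fields["pipe"] == pipe)


expected = '[ncm-R] word: "", func: "select", pkg: None, pipe: iris'
print(parse_ncmr_line(expected))
print(parser_saw_piped_call(expected, "select", "iris"))
```

Because the pattern is searched rather than anchored at the start, the same helper also accepts full log lines that carry a timestamp and module prefix.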

seb-mueller commented 6 years ago

Following your instructions, it seems ncm-R works fine for other completion types. For example, typing library( and leaving the cursor just after the bracket gives me a list of all installed libraries and puts this into the log:

2018-02-01 10:58:08,088 [INFO @ cm_core.py:cm_refresh:376] 758 - notify_sources_to_refresh calls cnt [0], channels cnt [1]
2018-02-01 10:58:08,090 [INFO @ cm_core.py:_refresh_completions:577] 758 - _refresh_completions names: ['R', 'cm-ultisnips', 'cm-filepath', 'cm-bufkeyword', 'cm-tmux'], startcol: 8, matches cnt: 0
2018-02-01 10:58:08,096 [INFO @ r.py:cm_refresh:286] 758 - [ncm-R] word: "", func: "library", pkg: None, pipe: None
2018-02-01 10:58:08,101 [INFO @ cm_core.py:cm_complete:244] 758 - update popup for [R]
2018-02-01 10:58:08,102 [INFO @ cm_core.py:_refresh_completions:577] 758 - _refresh_completions names: ['R', 'cm-ultisnips', 'cm-filepath', 'cm-bufkeyword', 'cm-tmux'], startcol: 9, matches cnt: 64

Likewise, data( lists the datasets correctly: [INFO @ r.py:cm_refresh:286] 758 - [ncm-R] word: "", func: "data", pkg: None, pipe: None

However having the cursor on select (i.e. POS1) doesn't give me a list:

[INFO @ r.py:cm_refresh:286] 758 - [ncm-R] word: "select", func: "", pkg: , pipe: flights

Note that, in contrast to my initial report, this has stopped working as well for some reason (i.e. it no longer gives me a list). The same happens for iris (with a bit more context from the log):

2018-02-01 11:05:48,259 [INFO @ cm_core.py:cm_refresh:376] 758 - notify_sources_to_refresh calls cnt [1], channels cnt [1]
2018-02-01 11:05:48,260 [INFO @ cm_core.py:_refresh_completions:577] 758 - _refresh_completions names: ['R', 'cm-filepath', 'cm-bufkeyword', 'cm-tmux'], startcol: 9, matches cnt: 0
2018-02-01 11:05:48,329 [INFO @ r.py:cm_refresh:286] 758 - [ncm-R] word: "", func: "select", pkg: None, pipe: iris
2018-02-01 11:05:48,334 [INFO @ cm_core.py:cm_complete:244] 758 - update popup for [R]
2018-02-01 11:05:48,335 [INFO @ cm_core.py:_refresh_completions:577] 758 - _refresh_completions names: ['R', 'cm-filepath', 'cm-bufkeyword', 'cm-tmux'], startcol: 10, matches cnt: 1
2018-02-01 11:06:19,779 [INFO @ cm_tmux.py:cm_event:36] 758 - refresh_keyword on event FocusGained
2018-02-01 11:06:19,783 [INFO @ cm_tmux.py:refresh_keyword:53] 758 - list-window: 0

It seems like matches cnt = 0 might be the culprit, but I'm not sure what to do about it in terms of debugging.
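To follow how the match count changes across refreshes, the `_refresh_completions` lines can be pulled out of the log with a few lines of Python. This is a rough sketch: the pattern is based only on the log excerpts in this comment, and `match_counts` is a hypothetical helper name.

```python
import re

# Pattern based on the _refresh_completions lines quoted in this thread.
REFRESH_LINE = re.compile(
    r"_refresh_completions names: \[.*?\], startcol: (\d+), matches cnt: (\d+)"
)


def match_counts(log_lines):
    """Yield (startcol, matches_cnt) for each _refresh_completions log line."""
    for line in log_lines:
        match = REFRESH_LINE.search(line)
        if match:
            yield int(match.group(1)), int(match.group(2))


# Two lines copied from the iris log excerpt above.
sample = [
    "2018-02-01 11:05:48,260 [INFO @ cm_core.py:_refresh_completions:577] 758 - "
    "_refresh_completions names: ['R', 'cm-filepath', 'cm-bufkeyword', 'cm-tmux'], "
    "startcol: 9, matches cnt: 0",
    "2018-02-01 11:05:48,335 [INFO @ cm_core.py:_refresh_completions:577] 758 - "
    "_refresh_completions names: ['R', 'cm-filepath', 'cm-bufkeyword', 'cm-tmux'], "
    "startcol: 10, matches cnt: 1",
]

for startcol, count in match_counts(sample):
    print(f"startcol={startcol} matches={count}")
```

In a real session the list would be replaced by the open log file, since iterating over a file object yields its lines.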

NVIM v0.2.0-6-g3979c6c Build type: Debug

I tried it with ncm-R v0.6 and v0.7, with similar results.

P.S. Having searched around a bit more, I've noticed that the GlobalEnvList_* file in g:rplugin_tmpdir is empty, but the globalenv_* file seems to have object info in it:

.GlobalEnv | Libraries

   &#flights    
   [#iris2   [150, 5]
   ├─ {#Sepal.Length    
   ├─ {#Sepal.Width 
   ├─ {#Petal.Length    
   ├─ {#Petal.Width 
   └─ '#Species 
   &#iris

Happy to provide more info if needed and thanks for looking into it!

gaalcaras commented 6 years ago

Okay, I think we're looking at two different things. The GlobalEnvList_* stuff has more to do with Nvim-R, so let's leave this one for later.

It does seem like the code parser of ncm-R is failing somehow to detect your select( as a piped function. The code for that is here:

https://github.com/gaalcaras/ncm-R/blob/master/pythonx/rlang.py

Maybe this is as simple as a Regex problem, or it could very well be that my parsing function is flawed. I'd greatly appreciate it if you took some time to debug it: if you find what went wrong, we could not only fix a bug but also think about writing some tests to ensure ncm-R is more reliable in the future.

Thanks a lot for your feedback and keep me posted!

seb-mueller commented 6 years ago

Happy to try some debugging, but I'm not very familiar with ncm or with how Python plays with vim in general. Is there any way to call this Python script externally (bypassing vim) for debugging, so I can inspect the code at runtime?

Also, the piped function detection seems to work for iris ([ncm-R] word: "", func: "select", pkg: None, pipe: iris) but not for the flights data ([ncm-R] word: "select", func: "", pkg: , pipe: flights); see also my last post. So only the detection for the flights dataset seems to fail, yet no list is shown in either case. There is probably something wrong with the filtering.

gaalcaras commented 6 years ago

> Is there any way to call this Python script externally (bypassing vim) for debugging, so I can inspect the code at runtime?

There is actually. For the rlang module (and some other modules I guess), it would be pretty easy. Most of its functions take three arguments: buff, linenum and numcol. The last two relate to the position of the cursor in your buffer and buff contains the buffer's content (it's actually a list of all the lines).

For instance, to test POS1 you could do something like:

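# rlang is ncm-R's pythonx/rlang.py; run this from that directory (or add it
# to sys.path) so the import resolves.
import rlang

# The buffer is a list of lines; the cursor sits right after "select(".
buff = ['library(tidyverse)', 'library(nycflights13)', '',
        'data(flights)', '', 'flights %>%', '  select(']

# Arguments: buffer content, line number and column number of the cursor.
func = rlang.get_function(buff, 7, 10)
print(func)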

I didn't actually run the code, so it might be slightly off but that's the general idea.
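To make the (buff, linenum, numcol) idea concrete without needing ncm-R on the path, here is a toy stand-in that can be run on its own. It is emphatically not the code in rlang.py: `toy_get_function` and `toy_get_pipe` are hypothetical names, the regular expressions are a simplification, and the 1-based line and column numbering is an assumption inferred from the `(buff, 7, 10)` call above.

```python
import re

# A line consisting only of "<name> %>%" starts a pipe chain (simplified).
PIPE_START = re.compile(r"^\s*([A-Za-z.][\w.]*)\s*%>%\s*$")
# A function name followed by "(" right before the cursor.
OPEN_CALL = re.compile(r"([A-Za-z.][\w.]*)\($")


def toy_get_function(buff, linenum, numcol):
    """Name of the call whose "(" sits right before the cursor, else None.

    Assumes 1-based linenum and numcol.
    """
    before_cursor = buff[linenum - 1][:numcol - 1]
    match = OPEN_CALL.search(before_cursor)
    return match.group(1) if match else None


def toy_get_pipe(buff, linenum, numcol):
    """Object that starts the %>% chain leading to the cursor line, else None."""
    # Walk upwards for as long as the preceding lines end with %>%.
    for line in reversed(buff[:linenum - 1]):
        if not line.rstrip().endswith("%>%"):
            break
        match = PIPE_START.match(line)
        if match:
            return match.group(1)
    return None


buff = ['library(tidyverse)', 'library(nycflights13)', '',
        'data(flights)', '', 'flights %>%', '  select(']

print(toy_get_function(buff, 7, 10))
print(toy_get_pipe(buff, 7, 10))
```

Small buffers like this one would also be a natural starting point for unit tests of the real parser, with each POS case from the original report becoming one test.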

If you want to test the whole plugin, it's possible using the Neovim Python API (you can look at how they do their own testing if you're interested) but it's a lot harder, because ncm-R has a lot of moving parts (talking to Neovim, Nvim-R and to NCM). This is why I usually debug with two tmux panes, one on the left with Neovim and one on the right to monitor the logs in real time. It's far from ideal, but I can usually pinpoint the bug after some trial and error.

seb-mueller commented 6 years ago

Thanks for the instructions. Having followed your suggestions I got the following results (likewise for flights):

print(rlang.get_function(buff, 7, 10))
#[None, 'select']
print(rlang.get_pipe(buff, 7, 10))
#iris

So I suppose the regex seems to work. Next I wanted to inspect the workspace and again followed your instructions. Briefly, I started nvim this way:

NVIM_PYTHON_LOG_FILE=nvim.log NVIM_PYTHON_LOG_LEVEL=INFO NVIM_LISTEN_ADDRESS=/tmp/nvim nvim

I then opened two new tmux panes: one monitoring the logs with tail -f nvim.log_py3_cm_core, and the other running a Python shell (ipython3), which gives me access to the vim buffer etc. as described in the Python API guide:

import rlang  # ncm-R's pythonx/rlang.py must be importable (e.g. on sys.path)
from neovim import attach
# Create a Python API session attached to the unix domain socket created above:
nvim = attach('socket', path='/tmp/nvim')
buffer = nvim.current.buffer
print(rlang.get_pipe(buffer, 7, 10))
# flights

I haven't worked with this API before and was wondering how to inspect the actual Python objects. Is there a way to import them, similar to accessing the buffer (i.e. using the above nvim object)? In particular, the Rsource object and CoreHandler seem to contain all the relevant info, but I don't know how to access them. I hope that is not too confusing, and thanks for any pointers.

Also, I'm curious which function is actually being called at POS1: cm_core.py:_refresh_completions?

gaalcaras commented 6 years ago

> So I suppose the regex seems to work.

OK, that's good I suppose :)

> I haven't worked with this API before and was wondering how to inspect the actual Python objects. Is there a way to import them, similar to accessing the buffer (i.e. using the above nvim object)? In particular, the Rsource object and CoreHandler seem to contain all the relevant info, but I don't know how to access them. I hope that is not too confusing, and thanks for any pointers.

Interesting question! I've never really looked into it, but my guess is you would have to talk to the rpc server directly. Look at the end of https://github.com/roxma/nvim-completion-manager/wiki/Trouble-shooting, it seems like you can start a source manually, so maybe there's a clever hack to attempt here.

But honestly, NCM clearly favors logs for debugging, otherwise it would probably have unit tests by now. For example, it's easy to output objects to the logs with LOGGER.info() and ncm-R builds on that in rsource. I think it's going to be complicated to do what you want to do without patching NCM yourself. If you find a way to do it though, I'd be interested.
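As an illustration of the log-based approach, dumping an object through a logger might look like the sketch below. It uses only the Python standard library: the `LOGGER` here is a plain `logging` logger created for the example (NCM and ncm-R configure their own), the in-memory stream stands in for the real log file, and `log_object` and the sample `matches` list are hypothetical.

```python
import io
import logging
from pprint import pformat

# In-memory stream standing in for the real log file (assumption for the demo).
log_stream = io.StringIO()

LOGGER = logging.getLogger("ncm_r_debug_sketch")
LOGGER.setLevel(logging.INFO)
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
LOGGER.addHandler(handler)
LOGGER.propagate = False  # keep the demo output out of the root logger


def log_object(label, obj):
    """Write a readable dump of obj to the log at INFO level."""
    LOGGER.info("%s: %s", label, pformat(obj))


# Hypothetical completion candidates, just to have something to dump.
matches = [{"word": "Sepal.Length"}, {"word": "Species"}]
log_object("matches", matches)

print(log_stream.getvalue())
```

With the log followed in a second pane via tail -f, a dump like this shows up as soon as the corresponding code path runs.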

> Also, I'm curious which function is actually being called at POS1: cm_core.py:_refresh_completions?

As far as ncm-R is concerned, cm_refresh is indeed the method that's used by NCM to update the list of suggestions (it's automatically triggered on the basis of the cm_refresh_patterns or manually with cm_force_refresh). Then it calls the complete method, which as I understand it basically displays the pop-up menu.

seb-mueller commented 6 years ago

Thanks for all the info. I've played around a bit but didn't really understand the structure and interplay between ncm and ncm-R well enough to debug the right objects. I tried adding LOGGER.info() calls, but found that probably all the info is already generated by ncm itself through a few LOGGER.debug calls, which are captured at the debug log level (as opposed to the info log level I used before):

 NVIM_PYTHON_LOG_FILE=nvim.log NVIM_PYTHON_LOG_LEVEL=DEBUG NVIM_LISTEN_ADDRESS=/tmp/nvim nvim ncm_test.r

I've extracted the output into one file (debug_library_showed_list.txt) for a working case, where the cursor is at library( and a list is shown correctly (i.e. including cowplot etc.), and another file (debug_filter_iris2_failed.txt) with the cursor at POS1, where no list is shown.

Since the output is very complex I couldn't make much sense of it, but maybe it's of use to you. Anyway, this turned out to be rather complex and I wouldn't mind dropping the issue altogether (P.S. I've got the same issue on different machines with updated nvim and ncm versions as of today).

debug_filter_iris2_failed.txt debug_library_showed_list.txt

gaalcaras commented 5 years ago

I'm closing this issue since it's about an older (and deprecated) version of ncm-R. But feel free to open a new issue if you run into a similar bug with the latest version of ncm-R.

gaalcaras / ncm-R

Variables inside data transformation pipelines: only for certain dataframes? #6