dictation-toolbox / natlink

Natlink provides the interface between Dragon and python

Live Debugging #135

Open Voxellence opened 2 years ago

Voxellence commented 2 years ago

Back in 2015 I was able to attach the Visual Studio debugger to the instance of Python 2.7 running inside Dragon NaturallySpeaking. This allowed me to speak a NatLink/Dragonfly command and break into the debugger, step through code, observe variable values, etc. It even allowed stepping into NatLink's Python code, which I have now come to understand is called "natlinkcore".

Since the conversion of NatLink to work with Python 3.x and the migration by Microsoft in both Visual Studio and Visual Studio Code to a newer debugger called debugpy, I am no longer able to attach to the debugger the way I used to. Oddly enough, I have been able to attach by accident in VS Code, but I cannot seem to figure out how to reliably do so. Somehow, in just screwing around trying to get the debugger to attach, I swear it somehow attached successfully on two separate occasions, allowing me to step through code after having initiated the debugging session via voice command. And yet I spent all of yesterday in vain attempting to get it to connect again, which was maddening because I know for a fact that I saw it working with my own two eyes in days gone by.

I even went so far as reinstalling Visual Studio 2015 with the PTVS package, but even that will no longer connect. Also, VS 2015 only recognizes versions of Python up to 3.5, but I don't know whether that really mattered. What was interesting was that with the PTVS setup the dialog box that popped up actually stated that it was able to attach to the "Managed" code, but "Failed to attach to" ... "Python: An operation is not legal in the current state."

When you try this with Visual Studio 2022, you're given the option of connecting to sub-processes, but this really wasn't helpful. I have a suspicion that the fact of Visual Studio 2022 being now a 64-bit application could affect what's going on, although I am not certain. I also tried with VS 2017, no cigar. I don't believe I tried it with VS 2019, but why bother? In any case, I have failed utterly, except for a couple of accidents that I cannot duplicate the results of.

Any insight into this would be helpful, because I believe we could all benefit from it. Of course, one could go back to using Python 2.7 and an older version of NatLink and experience the same joys that I did heretofore, but what we're here for now is the current version of NatLink, right?

Now, when you're working in VS Code with debugpy and the latest version of NatLink, there isn't any real feedback when it fails. It simply fails. But I do know that it is at least possible, despite the apparent incompatibilities.

Voxellence commented 2 years ago

Regarding what may or may not be necessary in the scripts themselves, in the past I had code like this...

import ptvsd
ptvsd.enable_attach(secret = None)
# ...but I don't think it was really doing anything, unless I wanted to insert a
# hard-coded breakpoint into the script.

# With debugpy you can do this...
import debugpy
debugpy.listen(("localhost", 5678))  # the address is passed as one (host, port) tuple
# ...or just...
# debugpy.listen(5678)  # call listen() only once per process
# ...since "localhost" is the default. And then there's this...
debugpy.wait_for_client()

But none of that was of any use. It only caused NatLink to put the script on the naughty list. So once again, it seems to me that the only use for importing debugpy is if you want to insert hard-coded breakpoints, but maybe I'm missing something.

Of course, it's also possible to start VS Code from the command line and include parameters for debugpy, but I'm not so sure that would yield any better results than attaching manually inside the IDE.

dougransom commented 2 years ago

I have on my list to re-add debugpy to natlink and configure it via the .ini file. We had it working about a year ago on another branch, and I just haven't finished the port into the current branch, which has a rewrite of natlink. When I get to it, it won't take long.

There is some code you can maybe call in your own Python for now. Eventually I will have it break on startup, when you call a function, or when you issue a Unimacro command. The code is here: https://github.com/dictation-toolbox/natlinkcore/blob/main/src/natlinkcore/natlinkpydebug.py

reckoner commented 2 years ago

I have been able to do this kind of debugging using wingware with Python 2.7. I have not had the chance to try all of the new 3.x natlink work. The main thing to do in wingware is import wingdbstub and then set a breakpoint in the IDE. You can get a free personal license for Wingware.

Voxellence commented 2 years ago

Wow, that's impressive, @dougransom. It looks like I just call start_dap() as a first step. I'm going to have a go at it and see how far I get.

Voxellence commented 2 years ago

Hello, @reckoner. I have actually thought of you every now and then, as I recall crossing paths with you maybe 10 or 15 years ago. Not sure exactly why now, as the sands of time and the trials of being human have ravaged my memory a bit in the interim. Let's see. Ah, yes, I have an old copy of your "alias that" Vocola command from 2007. I also have an old macro from 2008 where you're given credit. Looks like it was intended to find and enable NatLink. I still can't remember exactly why your name is so prominent in my memory, but I do know that I always thought it was a clever moniker. There was also someone who called him-/herself Xanadu that I interacted with some during that general time period.

Anyway, I'm rambling. Thanks for your input. I'm pretty much set on sticking with the Microsoft debugger though if at all possible, as I've used it forever going back to Visual Studio 6.

reckoner commented 2 years ago

@Voxellence Cool!

Voxellence commented 1 year ago

Well, I created my own simple script which looks like this...

from natlinkcore import natlinkpydebug

natlinkpydebug.debug_check_on_startup()

The script does nothing other than call the debug_check_on_startup() function. This function prints the location of a document with instructions and then calls start_dap() if the environment variable named by __natLinkPythonDebugPortEnviornmentVar exists in os.environ. So apparently start_dap() is only intended to be called by debug_check_on_startup(). But if that is the case, there is no reason for the if statement inside start_dap(), which performs that same test, to be there.

If we do call start_dap() directly, then we're in trouble if the environment variable named by __natLinkPythonDebugPortEnviornmentVar does not exist in os.environ. And you've given me no information as to how that environment variable is supposed to be set.

When we step into start_dap(), the first thing that happens is that the variable __debug_started is checked to see whether it's true or false. It is of course false on the first run, because that's the value it was initialized to at the top of the natlinkpydebug module.

Next a try/except block is arrived at wherein the first thing that happens is to see if the value of the variable __natLinkPythonDebugPortEnviornmentVar is in os.environ. Said variable was initialized to a value of "NatlinkPyDebugPort" at the top of the natlinkpydebug module, and indeed in my case this value does not exist in os.environ. This means that the variable natLinkPythonPortStringVal never gets initialized, and that's bad because the very next line of code which executes below the 'if' block is one which attempts to access and print this uninitialized variable.

This is clearly a bug, because the code in the module will always crash if the value being looked for is not present in os.environ. Also, the name of the variable needs to be corrected because it misspells the word "Environment".
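
The failure mode described above can be reproduced in miniature. What follows is only a sketch of the pattern, not the actual natlinkpydebug source: the function names read_port_buggy and read_port_fixed are hypothetical, and the only detail taken from this thread is the environment variable name "NatlinkPyDebugPort". Returning None when the variable is absent is an assumption about what a fix might do.

```python
ENV_VAR = "NatlinkPyDebugPort"  # name quoted earlier in this thread

def read_port_buggy(environ):
    # Hypothetical reduction of the bug: the local is bound only inside the 'if'.
    if ENV_VAR in environ:
        port_string = environ[ENV_VAR]
    # Raises UnboundLocalError when ENV_VAR is absent from environ.
    print(f"port string: {port_string}")
    return int(port_string)

def read_port_fixed(environ):
    # Assumed fix: look the value up once and treat "absent" as "debugging not requested".
    port_string = environ.get(ENV_VAR)
    if port_string is None:
        return None
    return int(port_string)
```

Passing a plain dict such as {} or {"NatlinkPyDebugPort": "7474"} in place of os.environ is enough to exercise both paths without touching the real environment.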

Can you correct this code and further instruct me, @dougransom? If you'd like me to make the corrections, I suppose I could do that by forking and submitting a PR. Whichever you prefer is fine with me.

This does leave a question, though, which is: Since I know that the value "NatlinkPyDebugPort" is being searched for in os.environ, should I set up such an environment variable on my machine? I'm not familiar with working with environment variables in Python, so it's an area I need to learn more about. Frankly, I'm not a very good Python programmer at all, but hey, I'm learning.
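
On the environment variable question: a process receives its environment when it starts, so a variable like this would have to exist before Dragon is launched (on Windows, for example, via `setx NatlinkPyDebugPort 7474` or the System Properties dialog). Reading it from Python goes through os.environ. The helper below is a hypothetical sketch, not part of natlinkcore; falling back to 7474 when the value is missing or invalid is an assumption made for illustration.

```python
import os

def debug_port(environ=os.environ, name="NatlinkPyDebugPort", default=7474):
    """Return the port stored in `name`, or `default` if it is unset or invalid."""
    raw = environ.get(name)
    if raw is None:
        return default
    try:
        port = int(raw)
    except ValueError:
        return default
    # Valid TCP ports are 1..65535.
    return port if 0 < port < 65536 else default

# Assigning to os.environ affects only this process and the children it starts,
# not Dragon or any other program that is already running.
os.environ["NatlinkPyDebugPort"] = "7474"
print(debug_port())
```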

I do also wonder why port number 7474 has been chosen instead of the default port number 5678. I have no preference, and am just curious as to what the motivation was behind this choice, or whether it is totally arbitrary.

Thanks in advance.

Voxellence commented 1 year ago

Alas, I have come to the conclusion that this code is pretty sloppy at best. There are some module level variables that are prefixed with dunders and others which are not. Some of these variables (ones without the dunder) appear to be there only for the purposes of testing and should be removed. In my opinion from a stylistic, practical, and readability perspective, all of the module level variables should be consistently prefixed with a single underscore, as we are not desiring to have any mangling done on them, so a single underscore would suffice to flag them as being module level variables at a glance.
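
For anyone wondering where mangling actually bites: a double-underscore name at module level is left alone when plain functions use it, but any reference to it from inside a class body is rewritten by the compiler. The names below (Debugger, __port, _port) are made up for this illustration and do not come from natlinkpydebug.

```python
__port = 7474  # double leading underscore at module level
_port = 7474   # single leading underscore

class Debugger:
    def mangled(self):
        # Inside a class, `__port` is compiled as `_Debugger__port`,
        # so this global lookup fails with NameError.
        return __port

    def plain(self):
        # A single underscore is never mangled.
        return _port

def module_level():
    # Outside a class the double-underscore name resolves normally.
    return __port
```

So the dunder prefix is harmless today only because the module has no classes, which is one practical argument for the single underscore.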

As mentioned above, there is either a duplicate test for an environment variable or a bug, depending upon whether or not start_dap() is intended to be called independently of debug_check_on_startup() or not. If start_dap() is only intended to be called by debug_check_on_startup(), then the duplicate test for the environment variable should be removed.

Anyway, there's quite a bit of cleaning up that could be done, and I've been so focused on trying to understand what's going on with the code that I still haven't gotten it to work yet.

I'm going to do my own editing here and see if I can get it working now. It will make things much more simple for me to dispense with the debug_check_on_startup() function and just use start_dap(), although I'm going to rename that to something more straightforward. I would be more in favor of putting the code in a class and overriding a virtual method if/when another debugger is supported in the future, as opposed to naming functions "dap", etc.
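
The class-based idea might look roughly like this. It is a sketch under assumptions, not a proposal for the actual natlinkcore API: DebuggerBackend, DebugpyBackend and RecordingBackend are hypothetical names, and the only real library call is debugpy.listen().

```python
from abc import ABC, abstractmethod

class DebuggerBackend(ABC):
    """Hypothetical base class: one subclass per supported debugger."""

    def __init__(self, port=7474):
        self.port = port
        self.started = False

    def start(self):
        # Start listening at most once per process.
        if not self.started:
            self._listen()
            self.started = True
        return self.started

    @abstractmethod
    def _listen(self):
        """Open whatever endpoint the concrete debugger needs."""

class DebugpyBackend(DebuggerBackend):
    def _listen(self):
        import debugpy  # imported lazily so this module loads without debugpy
        debugpy.listen(self.port)

class RecordingBackend(DebuggerBackend):
    """Stand-in that exercises the base class without a real debugger."""

    def __init__(self, port=7474):
        super().__init__(port)
        self.calls = 0

    def _listen(self):
        self.calls += 1
```

Supporting another debugger would then mean adding one subclass that overrides _listen(), with the "start only once" bookkeeping shared in the base class.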

Anyhow, for now I'm also going to strip out all the code which attempts to offer the user the option of changing the port number used by storing it in an environment variable. Hopefully I can boil this down to a simple implementation that is readable. Some of the unnecessarily verbose variable names contribute to the horror of trying to read the code, so I'm going to simplify those as well, and eliminate unnecessary variables altogether. I don't know whether my changes will be embraced, but that's how I'm gonna roll.

In the end, I still appreciate you posting this, @dougransom, despite its deficiencies, because if the central idea of the code does work then I'll be pretty excited. I'll report back here as to whether I've had any success or not.

Voxellence commented 1 year ago

Well, here's my edited version of the code in case anyone is interested. I'm not sure it's really necessary to be worried about the name of the executable in this way. Maybe there's a better way, but I haven't thought about it much.

import debugpy

# For now only the Python in system path can be used for NatLink and this module.
_python_exec = "python.exe"
_port_number = 7474
# debugpy.breakpoint()

def debugger_info():
    # is_client_connected is a function, so it has to be called.
    return f"""
Python Executable: {_python_exec}
Port Number: {_port_number}
Is Client Connected: {debugpy.is_client_connected()}
"""

def debugger_listen():
    try:
        # Tell debugpy which interpreter to use when it spawns its debug adapter.
        debugpy.configure(python=_python_exec)
        debugpy.listen(_port_number)
        print(f"Listening on port {_port_number}")

        # Blocks until a debugger client attaches, so report the port first.
        debugpy.wait_for_client()

    except Exception as ex:
        print(f"Exception '{ex}' while starting debugger. Python executable '{_python_exec}' may not be correct.")

Now comes the final step of actually using the remote debugging setup in a way that gets my debugger into the code. I need to study more about the debugger, I suppose.

Voxellence commented 1 year ago

Having boiled that code down to its essence, it becomes clear that it's really not much different than the code I started this issue with, that being the basic debugpy code required to allow the debugger to attach to a running Python script. However, the question remains: what is the order of operations required to reliably attach a debugger to NatLink macro code when it runs as a result of an utterance interpreted by Dragon, halting execution at a breakpoint situated on a line of code of the user's choosing?

There are two ways to allow the Microsoft debugger to attach to your running code:

  1. based upon process ID
  2. based upon a host and port number

Back in 2015 I was attaching based upon the process ID, whereas now it appears that it will be necessary to attach based upon a host and port number. Yet I cannot confirm at this time that even that will work. I am merely assuming, since I have utterly failed to attach based upon process ID the way I used to, that attaching based upon host and port number might work. This method is usually used when the target code is running on a remote computer or inside a Docker container, but in this case it is not running remotely; it is running inside the address space of natspeak.exe. Presumably, this method mitigates the compatibility issues present when attempting to attach based upon process ID, but this remains to be seen, by me at least. Perhaps Mr. Ransom knows what it is that I am missing here.

I have been able to attach to Python code based upon host:port when I run it from a PowerShell terminal, but I am still struggling to attach to code running inside of Dragon. I will be continuing to try to get this working, and will report back here when/if I succeed.
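
Since a failed attach gives no feedback, one thing that can be checked independently is whether anything is listening on the port at all. The helper below is a hypothetical diagnostic using only the standard library; the name is_listening and the 7474 default are assumptions for illustration. One caveat I have not verified: the debug adapter may see this probe as a client that connected and immediately left, so it is best treated as a one-off check rather than something to run while a real session is in progress.

```python
import socket

def is_listening(host="127.0.0.1", port=7474, timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

If this returns False after Dragon has loaded the script that calls debugpy.listen(), the listener never started, and the problem lies in the script rather than in the IDE's attach settings.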

Voxellence commented 1 year ago

Well, it didn't take long to come to a final conclusion on this. First of all, it's time to pop the cork on a bottle of fine champagne! (Except that I'm not much of a drinker these days.) Nevertheless, if you haven't figured out how to break into the debugger as a result of an utterance, I'm here to tell you exactly how to do it.

Perhaps Mr. Ransom has some other bits of knowledge to offer. But the thing I was missing, the thing that made it work for me, was to put in a code breakpoint rather than relying upon creating breakpoints which appear as a red dot in the margin of Visual Studio Code. I will provide a step-by-step guide in a separate comment below, and if Quintijn would like, we can include this in the NatLink documentation. As I recall, there is actually a blank spot already prepared in the NatLink Help, but please do let me know exactly how I should proceed from here with regard to enhancing the documentation.
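
A code breakpoint here means calling debugpy.breakpoint() from the script itself. The sketch below is an illustration under assumptions and not the step-by-step guide mentioned above: break_if_attached and on_utterance are hypothetical names, and the debugpy import is guarded so the snippet still loads where debugpy is not installed.

```python
try:
    import debugpy
except ImportError:  # lets the sketch run without debugpy installed
    debugpy = None

def break_if_attached():
    """Pause in the attached debugger; do nothing when no client is connected."""
    if debugpy is None or not debugpy.is_client_connected():
        return False
    debugpy.breakpoint()  # execution pauses here once a client has attached
    return True

def on_utterance(words):
    # Hypothetical stand-in for the code a spoken command ends up running.
    break_if_attached()
    return " ".join(words).upper()
```

With a listener already started and VS Code attached to its host and port, speaking the command should stop execution at the debugpy.breakpoint() line, after which stepping works as usual.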

I would definitely like to see other people reporting here that they have been able to successfully debug their scripts in this way. So without further ado, I will proceed to outlining exactly how to debug your NatLink and Dragonfly macros by speaking them...

Voxellence commented 1 year ago

The NatLink README.md says near the bottom...

Debugging Without Dragon

If you know how to debug natlink when dragon is running, please update this section.

Therefore, rather than putting my detailed debugging instructions here, I'm going to create a new issue entitled "Debugging NatLink when Dragon is running", wherein I will post detailed instructions. Hopefully I can then get some guidance on exactly how to proceed with updating the documentation.

To @LexiconCode, thank you for your enthusiasm!