Improve nginx/Openresty debugging

rulatir commented 4 years ago

I hereby challenge you to screencast and publish a video that demonstrates:

your product, ZeroBrane studio
in a remote debugging session,
successfully stepping through and stopping on breakpoints in openresty Lua code
that uses modebug.lua debuggee library
and runs in request context (as oposed to merely init_by_lua_*)
in relatively recent openresty (not a version from 2014 when you wrote your tutorial)
in a Docker container
on a Linux host

If you cannot produce this video, then you should add exceptions to your published claims that ZBS supports openresty.

I have sunk thousands of dollars worth of my work time over the last year-and-a-half trying to achieve usable step debugging of openresty Lua code. Mere listing of different error messages I have seen could fill a book. Had I been officially warned that my platform (or openresty in general) is not supported, I would have invested that time into developing alternative strategies for debugging.

I am currently convinced that neither ZBS nor mobdebug specifically can possibly work with a modern openresty, and your claims that they do amount to false advertising. In my extensive Googling I haven't seen a single crumb of evidence that it ever worked for anyone. Will you be able to convince me otherwise? All it takes is actual demonstration.

pkulchenko commented 4 years ago

It's unfortunate that you had to spent so much time trying to make it work for you, but I'm not sure what exactly you are questioning. If you suspect that what I published didn't work in 2014, then I presented a working example with an explanation of how to configure the application (I believe it was using content_by_lua_* or similar context, not init_by_lua_*). If you think that it may not work with the latest version of Openresty, it's definitely a possibility, as it was non-trivial to get it to work in the first place and I solicited the help of @agentzh to get it working. Openresty is using a non-blocking coroutine calls (to implement async IO/socket calls), which doesn't work with the debugger, as it relies on blocking during socket calls, so we resorted to using their blocking variants of coroutine methods (which is likely to be undocumented in Openresty and probably a subject to change). This mechanism has been documented in mobdebug (starting on line 54). There were also differences in stack generation in Openresty, which appears to have been changed around 0.9.20 and addressed in f903e41d.

Overall, there are several reasons why the debugging may not be working:

Blocking coroutine calls are no longer available as coroutine._* methods or their semantics have changed
The stack generation has changed and needs to be taken into account (undo/update changes in f903e41d).
Path mapping between the container and the host system is more complicated than expected by the debugger. See this discussion for details on this with a configuration similar to yours.
Something else.

We provide consulting services, so if you are interested in figuring out what's not working in your specific configuration and whether it can be fixed, let me know and I'm sure we can find some arrangement to work on this.

rulatir commented 4 years ago

When considering whether I should even go and check all those links you provided, I must wiegh the balance between predicted effort and estimated chance of success, and this balance does not look good after 18 months of consistent futility.

You are THE author of the software. Can YOU make it work in the environment specified above? Because if YOU can't, then no-one can, and YOU should stop claiming that it works in that environment. Claiming nonexistent compatibility is unethical. It actually harms people. Stop doing it. Edit or delete the tutorial, or mark it as work in progress. Edit your StackOverflow answers where you point to the tutorial. Edit your StackOverflow answers where you insist that it "should work". I am not soliciting your consulting services. I am asking that you cease a harmful practice.

pkulchenko commented 4 years ago

You are THE author of the software. Can YOU make it work in the environment specified above? Because if YOU can't, then no-one can, and YOU should stop claiming that it works in that environment.

Can you point me to where I claimed that it's supposed to work on Linux in a Docker container? I think the description you are referring to is explicit that the instructions are for Windows, but should work on Linux and macOS as well.

Claiming nonexistent compatibility is unethical. It actually harms people. Stop doing it. Edit or delete the tutorial, or mark it as work in progress.

I don't owe you anything. Are you familiar with the terms of the license this is made available under: THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

You have a specific environment where you expect the software to work, but it doesn't. Isn't it similar to you buying a pain reliever and then complaining that it doesn't alleviate your particular pain? Are you going to complain about false advertising? You haven't even paid anything in this case. You do realize that the debugging could stop working because of the changes in both Openresty and the IDE and that I nowhere claim that it's working with the current version of Openresty and don't have resources to test for the compatibility for each release (but will do if you decide to pay for it).

I go above and beyond to support those who are using this software and will answer (and often resolve) issues that are being reported in a matter of days. It's not my fault that you spent all this time without contacting me trying to resolve the issue you had with the debugging.

I hereby challenge you to screencast and publish a video that demonstrates:

your product, ZeroBrane studio

in a remote debugging session,

successfully stepping through and stopping on breakpoints in openresty Lua code

that uses modebug.lua debuggee library

and runs in request context (as oposed to merely init_bylua*)

in relatively recent openresty (not a version from 2014 when you wrote your tutorial)

in a Docker container

on a Linux host

To make the discussion a bit more productive: I'll accept your challenge if you invest in it as well. I'll demonstrate first 6 items on your list (with or without Mobdebug changes) if you put up $500 (USD) and all 8 items for $1000 (USD). Obviously, you don't need to pay anything if the demonstration fails. It will be done on the current version of Openresty (1.17.8.2) and on a version of Windows (for 6 items) and Linux/Docker (for 8 items) I select.

rulatir commented 4 years ago

I've been programming for 20+ years, so yes, I am aware that software is usually provided as is, without warranty of any kind etc. etc. Of all the software that I have used in my life, almost all of it playing fairly in the same league (i.e. provided "as is", "without warranty" etc. etc.), this little debuggee module has the distinction of being singularly responsible for the greatest number of grey hair on my head.

pkulchenko commented 4 years ago

I find it strange that in the 2 years you tried to get it working (judging by your SO comments) you haven't opened a single ticket or asked me a question, but are still blaming me for your struggles.

I got it working with the current version of Openresty (1.17.8.2) in my setup with the following mobdebug patch:

diff --git a/lualibs/mobdebug/mobdebug.lua b/lualibs/mobdebug/mobdebug.lua
index e135d6cb..09383cc2 100644
--- a/lualibs/mobdebug/mobdebug.lua
+++ b/lualibs/mobdebug/mobdebug.lua
@@ -55,10 +55,7 @@ local MOAICoroutine = rawget(genv, "MOAICoroutine")
 -- methods use a different mechanism that doesn't allow resume calls
 -- from debug hook handlers.
 -- Instead, the "original" coroutine.* methods are used.
--- `rawget` needs to be used to protect against `strict` checks, but
--- ngx_lua hides those in a metatable, so need to use that.
-local metagindex = getmetatable(genv) and getmetatable(genv).__index
-local ngx = type(metagindex) == "table" and metagindex.rawget and metagindex:rawget("ngx") or nil
+local ngx = rawget(genv, "ngx")
 local corocreate = ngx and coroutine._create or coroutine.create
 local cororesume = ngx and coroutine._resume or coroutine.resume
 local coroyield = ngx and coroutine._yield or coroutine.yield
@@ -566,7 +563,9 @@ local function debug_hook(event, line)
   -- 'tail return' events are not generated by LuaJIT).
   -- the next line checks if the debugger is run under LuaJIT and if
   -- one of debugger methods is present in the stack, it simply returns.
-  if jit then
+  -- Nginx requires a slightly different handling, as it creates a coroutine
+  -- wrapper, so this processing needs to be skipped.
+  if jit and not ngx then
     -- when luajit is compiled with LUAJIT_ENABLE_LUA52COMPAT,
     -- coroutine.running() returns non-nil for the main thread.
     local coro, main = coroutine.running()

You also need to make sure that the project directory points to the openresty folder, so that if your nginx.conf specifies something like content_by_lua_file 'lua/content.lua'; the project directory will be set to the openresty folder and lua/content.lua file will be opened in the IDE (otherwise stepping and breakpoints may not work).

Did we agree on the challenge? I can try to get the docker configuration working too, or you can see if this patch fixes the issue there as well...

rulatir commented 4 years ago

I find it strange that in the 2 years you tried to get it working (judging by your SO comments) you haven't opened a single ticket or asked me a question

I haven't opened any tickets because I have the habit of checking for duplicates. There were duplicates. At the time I checked them, they remained open, without a resolution in sight, and with comments by you in the style of "hey try ZBS" and "it should work", followed by some back-and-forth with well-meaning competent users reporting the same issues again and again and again, followed by the conversation fading out and the original submitter of the ticket never indicating success. The same pattern was clearly discernible in your StackOverflow activity, including one question I did actually ask, not to you personally, but you did answer - in the same vein. Forgive me if you didn't exactly seem helpful.

That said, also thanks for finally proving me wrong. Yes, the patch works, even with resty-cli which is an extra nice surprise. Apologies. This obviously settles the challenge. What remains is to try it in my actual app, which I will do later today.

You also need to make sure that the project directory points to the openresty folder, so that if your nginx.conf specifies something like content_by_lua_file 'lua/content.lua'; the project directory will be set to the openresty folder and lua/content.lua file will be opened in the IDE (otherwise stepping and breakpoints may not work).

That's a bit of a capricious requirement on IDE's part, and it remains to be seen if it's even doable in my actual application as opposed to hellolua. What is the status of path mapping? I've seen it discussed, I've seen PRs, but they are unmerged.

rulatir commented 4 years ago

However I must also ask about the future of this patch. Are you planning to merge it either into this project or into ZBS? (Is ZBS's mobdebug.lua upstream or downstream from this?). I can see that the last commit here is exactly 2 years old, to a day. In the meantime I have seen you paste multiple patches in SO answers and issue comments. Every such patch presumably restores lost compatibility, sometimes lost for months or years, all the while the README claims compatibility continually and uninterruptedly. The only way for anyone to learn about these patches is to go through at least some of the hell I went through. I do see it as something of a problem.

pkulchenko commented 4 years ago

The same pattern was clearly discernible in your StackOverflow activity, including one question I did actually ask, not to you personally, but you did answer - in the same vein. Forgive me if you didn't exactly seem helpful.

I understand why you may be disappointed, but SO is not designed for troubleshooting of issues and any back and forth to get additional information is difficult, as it's targeted at providing answers to specific questions.

If there are tickets that you feel duplicate your issue and have not been resolved, feel free to leave a comment and re-open the ticket if needed.

That's a bit of a capricious requirement on IDE's part, and it remains to be seen if it's even doable in my actual application as opposed to hellolua. What is the status of path mapping? I've seen it discussed, I've seen PRs, but they are unmerged.

There is nothing capricious about this requirement, as it relates to how openresty reports paths for scripts loaded from content_by_lua_file: since the path is specified as lua/file.lua, this is how it's reported to the debugger, so to make it work with breakpoints/stepping, the project directory has to be set to the nginx folder. There may be ways to deal with the remapping (and I'll check the existing remapping logic to see if it can cover this case), but this configuration will work out of the box.

However I must also ask about the future of this patch. Are you planning to merge it either into this project or into ZBS? (Is ZBS's mobdebug.lua upstream or downstream from this?).

Yes, I plan to do this shortly. The debugger changes are usually implemented in Mobdebug first (this project) and then integrated into ZBS.

The only way for anyone to learn about these patches is to go through at least some of the hell I went through. I do see it as something of a problem.

There are no outstanding patches that I know about that require integration. I may post a patch to confirm that the proposed change is working and then integrate it into the project. There are still a couple of patches where the decision on how to integrate them hasn't been made, but that's it.

What is the status of path mapping? I've seen it discussed, I've seen PRs, but they are unmerged.

The PR that you've seen is for the case of multiple remapped folders for the same project (which is a rare case). The case of the individual path remapping should already be handled in the IDE. It remains unmerged, because while it works, it requires correct specification of the remapped paths (which are project-specific) and I'm still researching if this remapping can be done without manual configuration.

rulatir commented 4 years ago

There is nothing capricious about this requirement, as it relates to how openresty reports paths for scripts loaded from content_by_lua_file: since the path is specified as lua/file.lua, this is how it's reported to the debugger, so to make it work with breakpoints/stepping, the project directory has to be set to the nginx folder.

Debuggers should not impose limitations on how either source files (in the monorepo) or runtime files (in the container) can be organized. Users of a debugger should be able to lay them out as it makes sense for the project, and to completely specify that layout to the debugger.

The PR that you've seen is for the case of multiple remapped folders for the same project (which is a rare case).

Single folder mapping is sufficient for small projects, and granted, most projects are small. Ours isn't. We don't use content_by_lua_file at all. We have lua_package_path set up with absolute paths, we require() and initialize our Lua framework in one *_by_lua_block, then we call the framework APIs in other *_by_lua_blocks. Those blocks, as well as nginx configs containing them, are generated from a model at build time. It's a big thing with dozens of location blocks implementing different endpoints on two server flavors (caching proxy and content server), each location block using different modularized functionalities from the framework. Lua code of the entire framework and application is under /lua. Configs are under /etc/nginx. Dynamic JSON configs read by the framework are still elsewhere. The layout is highly determined by production contstraints like what can be on EFS and what must be on EBS (and local development images are downstream FROM production images). There is literally no notion of "the nginx folder". This is your typical nontrivial application.

rulatir commented 4 years ago

Regarding your research as to whether this can be done without manual configuration, I don't think it can. I am not aware of any advanced remote debugger, IDE-integrated or not, that manages this automatically. IDEA doesn't (*). MS doesn't. Eclipse doesn't. Layout of runtime files in a remote, possibly runtime-optimized environment is strictly additional information that must simply be supplied.

The trick is of course to ensure that debugging would still work OOTB for simple projects. Perhaps manual config should simply take precedence when provided.

(*) Actually the ability to understand composer.json and PSR-4 helps IDEA a lot with PHP, but I don't think there is any comparable standard in the Lua world.

pkulchenko commented 4 years ago

@rulatir, I pushed the applied patch.

Debuggers should not impose limitations on how either source files (in the monorepo) or runtime files (in the container) can be organized. Users of a debugger should be able to lay them out as it makes sense for the project, and to completely specify that layout to the debugger.

Right now the debugger figures out the mapping and it actually covers all cases I came across (including cross-platform debugging) with the exception of the multi-path debugging that @moteus is addressing with his patch.

I applied another change that removes the restriction of where the project directory should be set. Let me know if it works for you.

rulatir commented 4 years ago

Almost there!

What's missing is the ability to actually step into files that are now correctly resolved to files on the host machine. Please bear with me, because this snippet from the Output pane (from a busted run) may look like the step-into action succeeded, but it really didn't from user's POV.

Paused at file spec/lib/zua/port-mapping_spec.lua line 29
[172.24.0.9:42780] Debugger received (file, line, err): spec/lib/zua/port-mapping_spec.lua  29  nil
[172.24.0.9:42780] Debugger sent (command): step
Paused at file /path-to-monorepo/images/openresty/lua/lib/zua/port-mapping.lua line 33
[172.24.0.9:42780] Debugger received (file, line, err): /path-to-monorepo/images/openresty/lua/lib/zua/port-mapping.lua 33  nil
[172.24.0.9:42780] Debugger sent (command): out
Paused at file spec/lib/zua/port-mapping_spec.lua line 30
[172.24.0.9:42780] Debugger received (file, line, err): spec/lib/zua/port-mapping_spec.lua  30  nil

I successfully suspend in a test (`port-mapping_spec.lua:29), at a breakpoint on a line that contains a call into the module under test.

I then manually issue step i.e. F10 - step into. I expect to step into port-mapping.lua:33, i.e. the module under test.

The debuggee seems to map the path correctly (using the project directory it received during the handshake), because what the next "Debugger received" line says is the correct path, the actual location of port-mapping.lua in my monorepo. Furthermore, the debuggee reports having actually paused there.

But the debugger (ZBS) does a weird thing: it immediately sends the out command all by itself. It is not me pressing Ctrl+F10 or the equivalent toolbar button. ZBS is doing it.

Next thing, debuggee suspends at the next line after "stepping out", i.e. at `port-mapping_spec.lua:30. ZBS moves the current instruction indicator as expected. No surprises here.

It seems that mobdebug manages to correctly translate the path, and sends correct path to the debugger, but the debugger just refuses to act on this information and instead steps out in panic. (Should I file this against ZBS?)

(I admit I achieved correct path translation with something of a hack: I run busty with LUA_PATH consisting of entries relative to the test directory, which is both the current directory when running busted, and the project directory in ZBS. So what I believe is happening inside mobdebug is that require "zua.port-mapping" is a hit against the ../../lua/lib/?.lua entry in LUA_PATH, so it is "found" by Lua as ../../lua/lib/zua/port-mapping.lua. Mobdebug prefixes that with the project directory, normalizes it, and sends it to ZBS.)

By the way, busted itself runs smoothly and all tests pass, so it is not an obvious misconfiguration of paths on Lua module resolution level.

pkulchenko commented 4 years ago

But the debugger (ZBS) does a weird thing: it immediately sends the out command all by itself. It is not me pressing Ctrl+F10 or the equivalent toolbar button. ZBS is doing it.

As far as I understand the situation, this is by design. You either need to have the file already opened in the IDE or specify editor.autoactivate=true in the config to step into files that are not already opened. See https://studio.zerobrane.com/doc-editor-preferences.

So what I believe is happening inside mobdebug is that require "zua.port-mapping" is a hit against the ../../lua/lib/?.lua entry in LUA_PATH, so it is "found" by Lua as ../../lua/lib/zua/port-mapping.lua. Mobdebug prefixes that with the project directory, normalizes it, and sends it to ZBS.)

Yes, this looks like the correct explanation of what happens. Are you running the patched/latest version of Mobdebug?

pkulchenko commented 4 years ago

@rulatir, there is one more reason to have the file already opened, as the IDE does a bit of fuzzy logic to identify the file to stop in when there is no a direct match in the project folder. The autoactivate option only works if there is a proper/direct match.

rulatir commented 4 years ago

As far as I understand the situation, this is by design. You either need to have the file already opened in the IDE or specify editor.autoactivate=true

That (and restarting ZBS) did the trick, thank you!

Yes, this looks like the correct explanation of what happens. Are you running the patched/latest version of Mobdebug?

Yes I am.

A random idea here - have you ever considered moving the path mapping responsibility from mobdebug to debugger(s)? I believe, and I think you'll agree, that writing debuggee instrumentation is super hard. Instrumentation's responsibilities should be strictly limited to providing the magic that an external remote process cannot perform, physically or informationally. If something can be done on debugger's side, then it should. Mapping between instruction pointer and source file:line is much harder with compiled languages, which is why compiled executables get instrumented with debug info. But with interpreted languages, I think debuggers should be expected to handle that.

pkulchenko commented 4 years ago

That (and restarting ZBS) did the trick, thank you!

So everything works so far?

have you ever considered moving the path mapping responsibility from mobdebug to debugger(s)? I believe, and I think you'll agree, that writing debuggee instrumentation is super hard.

I agree and have considered that, but I don't see a good way to do this. For example, the IDE sends a request to set a breakpoint on file content.lua, but the Lua interpreter running reports the path as lua\content.lua. Is it the same file? Without making that decision in Mobdebug, we end up either missing breakpoints or reporting possible breakpoints to the host (over socket), which is likely to further slow down the process.

rulatir commented 4 years ago

So everything works so far?

In a buster run, yes. Will check request code debugging tomorrow, with the multipath PR.

For example, the IDE sends a request to set a breakpoint on file content.lua, but the Lua interpreter running reports the path as lua\content.lua

This is the other mapping you are referring to, that between debuggee-local uniquely identifying absolute paths and the way Lua's native debugging facilities refer to .lua files or expect them to be referred to. This mapping must of course be handled by Mobdebug, and it must work two-way, for reporting current position and for setting breakpoints. It may involve prefixing/unprefixing cwd, it may involve symlink resolution (hopefully not, because reverse symlink resolution is a bully), it may involve forward and reverse package path resolution, but ultimately it should be a normalization/denormalization layer handling whatever the Lua interpreter throws at Mobdebug and normalizing it to/from debuggee-local absolute paths. The latter should be the only path format to appear on the wire between Mobdebug and the debugger.

The next two-way mapping, between debuggee-local paths and source tree, should be handled entirely by IDE/debugger. The IDE should not request to set a breakpoint in file content.lua. It should specify an absolute path in debuggee's pathspace. And when Mobdebug suspends, it should likewise report an absolute path in debuggee's pathspace.

pkulchenko commented 4 years ago

Will check request code debugging tomorrow, with the multipath PR.

I'd be interested if it works both with and without the PR.

The next two-way mapping, between debuggee-local paths and source tree, should be handled entirely by IDE/debugger. The IDE should not request to set a breakpoint in file content.lua. It should specify an absolute path in debuggee's pathspace. And when Mobdebug suspends, it should likewise report an absolute path in debuggee's pathspace.

Sort of; in fact, mobdebug is fairly transparent in how this is done and doesn't really care about how the breakpoints are set. It (mostly) removes the basedir from the path, but if you are already passing a relative name (in debuggee space) or an absolute name in that space, either one will work. It will also report back whatever it gets from the Lua interpreter when the debugging is stopped, leaving to the controller to interpret the path it receives.

Maybe it's more of a historical artifact of how the IDE was developed, but my idea was that the IDE receives a request for debugging (against the current project tree) and does its best to satisfy this request. This is one of the reasons why I tried to minimize the amount of configuration that has to happen in the IDE, as the same instance can be used to debug a variety of environments and interpreters (although not at the same time).

rulatir commented 4 years ago

I'd be interested if it works both with and without the PR.

Will do. But honestly, even if it happens to work without the PR, we still need a manual override for when the automagic fails.

It will also report back whatever it gets from the Lua interpreter when the debugging is stopped, leaving to the controller to interpret the path it receives.

I don't understand. I am seeing lines like this:

Paused at file /path-to-monorepo/images/openresty/lua/lib/zua/port-mapping.lua line 33
[172.24.0.9:47374] Debugger received (file, line, err): /path-to-monorepo/images/openresty/lua/lib/zua/port-mapping.lua 33  nil

These paths are in project space, not in debuggee space. I am pretty sure this isn't what mobdebug gets from the Lua interpreter, because the interpreter has no idea about project space. So either mobdebug in fact does the full path mapping and doesn't report what it gets from the Lua interpreter, or debugger does the mapping and logs mapped paths instead of what it actually receives from mobdebug. Which one is it? Or something else?

(To clarify, these log lines are from a debugging session that did everything right: it stopped on breakpoints and stepped into new files opening them in IDE. I am only asking for a clarification about the "path interchange format" between mobdebug and controller).

pkulchenko commented 4 years ago

I am pretty sure this isn't what mobdebug gets from the Lua interpreter, because the interpreter has no idea about project space. So either mobdebug in fact does the full path mapping and doesn't report what it gets from the Lua interpreter, or debugger does the mapping and logs mapped paths instead of what it actually receives from mobdebug. Which one is it? Or something else?

It depends on several factors (and may be something reported by the interpreter). It should be easy to check. Can you add printing of debug.getinfo(2, "S").source? I'd be curious to see what value is indeed being reported.

If the path starts with ../, then it's concatenated with the "basedir" value (which may be set based on the project path) to properly normalize the result. In all other cases it returns whatever is reported by the interpreter (with the basedir value stripped), so that the returned path is relative to the basedir. This logic is in https://github.com/pkulchenko/MobDebug/blob/master/src/mobdebug.lua#L629-L656. Also, do you see the message in the Output panel about the path being remapped (you probably should)?

You can change the basedir value on the fly if this is not what you want (require"mobdebug".basedir("newval")), but it's rarely needed. It's only needed if the "project" folder changes after the debugging has been started.

This logic has been honed by years of incremental changes to make it work in a variety of environments as well as with dynamically loaded fragments. It's quite possible that there are some cases where it breaks, but so far it's been doing quite well.

rulatir commented 4 years ago

Looks like it works when I cross my fingers.

Is it possible to set things up so that debugger wouldn't pause immediately after mobdebug.start(), only on breakpoints? Not having to manually make the server proceed after it stops on mobdebug.start() would be one step fewer in the whole squaredance (restart debugger server, reload nginx config).

pkulchenko commented 4 years ago

Is it possible to set things up so that debugger wouldn't pause immediately after mobdebug.start(), only on breakpoints?

@rulatir, yes, you can set debugger.runonstart = true in the config. See https://studio.zerobrane.com/doc-general-preferences#debugger.

You also should be able to un/set breakpoints while the script is running.

rulatir commented 4 years ago

Is there a way to somehow start the debug session OnceAndForGood™? No matter what I try, I either get "Refused to start a new debugging session" warnings, or I get symptoms consistent with session not being started in the context in which lines with breakpoints are hit (i.e. nothing happens), or still worse: nothing happens in ZBS, but request is stuck (presumably on a breakpoint).

pkulchenko commented 4 years ago

You need to make sure you call require("mobdebug").done() to finish the current session or use Project > Detach Process from the IDE, as otherwise you may get requests from multiple (nginx) threads requesting debugging, which is going to be rejected by the IDE.

rulatir commented 4 years ago

So this means that mobdebug/ZBS combo was never designed to handle debugging code that can be run in multiple threads, and debugging such code in such environments will always involve extreme squaredancing (manual operations in the IDE before / after request, modifying debugged code in multiple places rather than just one, etc.)?

rulatir commented 4 years ago

Not that this is a big loss. Debugging test cases works much more reliably, and that is where debugging should be done anyway. Live request debugging is for emergencies.

pkulchenko commented 4 years ago

So this means that mobdebug/ZBS combo was never designed to handle debugging code that can be run in multiple threads, and debugging such code in such environments will always involve extreme squaredancing (manual operations in the IDE before / after request, modifying debugged code in multiple places rather than just one, etc.)?

No, it will work and I have a commercial extension that supports debugging of multiple threads in the same IDE, but it uses an extended UI, as it presents a list of threads under debugging, ability to suspend/resume, switch between those threads (as the breakpoints or current pointer needs to be redrawn in the IDE) and some other things. It's not trivial and would definitely complicate a "normal" UI. In the default case only one session is allowed, but the IDE can handle more.

Debugging test cases works much more reliably, and that is where debugging should be done anyway. Live request debugging is for emergencies.

Right and the vast majority of users only deal with one context at a time. This was primarily designed for a multi-threaded environment (think about a game with a number of components, each running in its own Lua VM interacting over a message bus or a blackboard) and requires a bit of instrumentation (each thread may be assigned a name, which will appear on the list of threads in the IDE, so the user knows what to switch to, rather than a default port #).

rulatir commented 4 years ago

Experimenting further, I now have a small bit of code of interest between mobdebug.start() and mobdebug.done(), and a breakpoint on one of the lines in between. This code is run on every request exactly once. There are no other calls to mobdebug from my code.

ZBS pauses on the breakpoint exactly every other time (1st, 3rd time etc. successful, 2nd, 4th time etc. not). Restarting the debug server in ZBS does not reset the parity. Reloading nginx config does. Control does reach the breakpoint every time, verified by logging a message to error.log. The message is logged immediately before mobdebug.done().

Good run (up to pausing on breakpoint):

[172.19.0.10:50086] Debugger sent (command):    basedir /path-to-monorepo/images/openresty/lua/
New base directory is /path-to-monorepo/images/openresty/lua/
[172.19.0.10:50086] Debugger received (file, line, err):    nil nil nil
[172.19.0.10:50086] Debugger sent (command):    delallb
[172.19.0.10:50086] Debugger received (file, line, err):    nil nil nil
[172.19.0.10:50086] Debugger sent (command):    setb /path-to-monorepo/images/openresty/lua/app/feature/dynamic-proxy.lua 25
[172.19.0.10:50086] Debugger received (file, line, err):    app/feature/dynamic-proxy.lua   25  nil
[172.19.0.10:50086] Debugger sent (command):    load /path-to-monorepo/images/openresty/lua/app/feature/dynamic-proxy.lua
[172.19.0.10:50086] Debugger received (file, line, err):    /lua/app/feature/dynamic-proxy.lua  24  nil
[172.19.0.10:50086] Debugger sent (command):    basedir /path-to-monorepo/images/openresty/lua/ /lua/
New base directory is /path-to-monorepo/images/openresty/lua/
[172.19.0.10:50086] Debugger received (file, line, err):    nil nil nil
[172.19.0.10:50086] Debugger sent (command):    delallb
[172.19.0.10:50086] Debugger received (file, line, err):    nil nil nil
[172.19.0.10:50086] Debugger sent (command):    setb /path-to-monorepo/images/openresty/lua/app/feature/dynamic-proxy.lua 25
[172.19.0.10:50086] Debugger received (file, line, err):    app/feature/dynamic-proxy.lua   25  nil
Mapped remote request for '/lua/' to '/path-to-monorepo/images/openresty/lua/'.
Debugging session started in '/path-to-monorepo/images/openresty/lua/'.
[172.19.0.10:50086] Debugger sent (command):    run
Paused at file app/feature/dynamic-proxy.lua line 25
[172.19.0.10:50086] Debugger received (file, line, err):    app/feature/dynamic-proxy.lua   25  nil

Bad run (didn't stop on breakpoint):

[172.19.0.10:50052] Debugger sent (command):    basedir /path-to-monorepo/images/openresty/lua/
New base directory is /path-to-monorepo/images/openresty/lua/
[172.19.0.10:50052] Debugger received (file, line, err):    nil nil nil
[172.19.0.10:50052] Debugger sent (command):    delallb
[172.19.0.10:50052] Debugger received (file, line, err):    nil nil nil
[172.19.0.10:50052] Debugger sent (command):    setb /path-to-monorepo/images/openresty/lua/app/feature/dynamic-proxy.lua 25
[172.19.0.10:50052] Debugger received (file, line, err):    app/feature/dynamic-proxy.lua   25  nil
[172.19.0.10:50052] Debugger sent (command):    load /path-to-monorepo/images/openresty/lua/app/feature/dynamic-proxy.lua
[172.19.0.10:50052] Debugger received (file, line, err):    app/feature/dynamic-proxy.lua   24  nil
Debugging session started in '/path-to-monorepo/images/openresty/lua/'.
[172.19.0.10:50052] Debugger sent (command):    run
Program finished
[172.19.0.10:50052] Debugger received (file, line, err):    nil nil false
Debugging session completed (traced 0 instructions).

These execution paths diverge at line 8 where debugger receives different response to load command. In good cases it is an absolute path (and it is correct), and in bad cases it is a relative path (relative to root, i.e. missing the initial /).

pkulchenko commented 4 years ago

ZBS pauses on the breakpoint exactly every other time (1st, 3rd time etc. successful, 2nd, 4th time etc. not). Restarting the debug server in ZBS does not reset the parity. Reloading nginx config does.

Interesting; I suspect what happens is that the basedir value doesn't get reset after the first execution is completed. nginx is unique in a sense that the run is over (which usually means that the application is done), but mobdebug (and its state) is still in memory and is used for the next run.

[172.19.0.10:50052] Debugger sent (command): load /path-to-monorepo/images/openresty/lua/app/feature/dynamic-proxy.lua [172.19.0.10:50052] Debugger received (file, line, err): app/feature/dynamic-proxy.lua 24 nil

The app reported relative path, whereas it should have started with /lua/app/..., so it doesn't get remapped properly and, as the result, breakpoints are not working.

If you are already calling require("mobdebug").done(), can you add require("mobdebug").basedir("") after that to see if that fixes the issue? If it does, I'll find a way to inject it properly.

pkulchenko commented 4 years ago

Hm, actually no, this shouldn't be a problem, as the basedir is set explicitly at the beginning and lastsource value is reset when basedir is set. I'll have to think what else may be happening.

The issue is definitely with the path reported differently in response to the LOAD command, as you noted. Can you print debug.getinfo(2, "S").source? Is it possible that the interpreter indeed reports a different path on even runs?

rulatir commented 4 years ago

Actually yes! I added md.basedir("") right after md.done() and now it pauses every time.

rulatir commented 4 years ago

But there is one more interesting thing going on: when it does work, the next thing the debugger does after receiving response to load command is issuing basedir second time, and this time it sends two arguments:

basedir /path-to-monorepo/images/openresty/lua/ /lua/

And it's a tab character between them! I'm not sure if this is the exact command string or a logging artifact.

rulatir commented 4 years ago

From where do you want me to print debug.getinfo(2, "S").source? Between start() and done()?

pkulchenko commented 4 years ago

when it does work, the next thing the debugger does after receiving response to load command is issuing basedir second time, and this time it sends two arguments:

Yes, that's correct; it's an indication that the path is being remapped. You'll see a corresponding message in the output panel to that effect.

Actually yes! I added md.basedir("") right after md.done() and now it pauses every time.

Interesting. I'll have to add this to done() code then, as it's the right thing to do anyway (as the basedir command is not required to come before the debugging starts) and may appear at any time.

Where do you want me to print debug.getinfo(2, "S").source?

Before start() call. I'm curious if anything changes there between the runs. Unlikely, but just in case to cover that.

rulatir commented 4 years ago

I did both the second and first frame, because code of interest is directly called from another file which is even hit by a different package path entry. In both cases, the info is the same between even and odd requests, so that's covered.

rulatir commented 4 years ago

Will monkeypatch done() for now.

pkulchenko commented 4 years ago

Yes, something like this should work:

diff --git a/src/mobdebug.lua b/src/mobdebug.lua
index c4373d9..e18d7ca 100644
--- a/src/mobdebug.lua
+++ b/src/mobdebug.lua
@@ -760,6 +760,7 @@ local function done()
   coro_debugger = nil -- to make sure isrunning() returns `false`
   seen_hook = nil -- to make sure that the next start() call works
   abort = nil -- to make sure that callback calls use proper "abort" value
+  basedir = "" -- to reset basedir in case the same module/state is reused
 end

 local function debugger_loop(sev, svars, sfile, sline)

pkulchenko commented 4 years ago

@rulatir, applied and pushed the last patch. Does this cover all your use cases?

rulatir commented 4 years ago

Thanks! There are issues with step over (TL;DR it has the same effect as "continue"), but I'll report those separately.

pkulchenko commented 4 years ago

There are issues with step over (TL;DR it has the same effect as "continue"), but I'll report those separately.

You may want to check ZBS tickets as well, as it may have been already reported there.

pkulchenko / MobDebug

Improve nginx/Openresty debugging #51