Closed dabrahams closed 2 years ago
This looks like a real bug.
If you need to reproduce, clone SwiftMarpa, install Swift if necessary, and invoke “swift test” from the root directory. With a breakpoint here, LLDB will show you the values cited.
You may need a "${prefix}/lib/pkgconfig/libmarpa.pc"; mine is
prefix="/opt/homebrew/Cellar/libmarpa/pkg-config"
exec_prefix="${prefix}"
libdir="${prefix}/lib"
includedir="${prefix}/include"
Name: libmarpa
Description: C language library that implements the parse engine at the core of Marpa
URL: https://jeffreykegler.github.io/Marpa-web-site/libmarpa.html
Version: 8.6.2
Cflags: -I"${includedir}"
Libs: -L"${libdir}" -lmarpa
I may proceed by re-implementing your test in CMake. I'm not inclined to try to debug the problem from inside Swift -- it's a totally new environment to me and, anyway, if the problem is going to be fixed, it has to be duplicate-able in C.
A (currently) pedantic point. I notice you get the values by accessing a structure, instead of calling the macros. This is not supported. I don't think it has anything to do with your problem, but in a future libmarpa, it may not work.
Presumably it would be worth your time to reduce the test a bit before you try to replicate it…but suit yourself.
🤷 Swift doesn't import function-style macros. I suppose I could have created a C shim that used the macros and re-exposed them as functions, and maybe I will do that eventually… but with macros like that any binary is tied to that version of marpa and would at least need to be recompiled to keep working if you changed the structure.
Actually, it should be sufficient to check the state of v
right after this line in your existing test.
The macros are documented as only valid after a call to marpa_v_step()
:
marpa_v_step() returns the type of step. Most step types have values associated with them. To access these values use the methods described in the section Basic step accessors
At the point where the problem occurs, has there been a call to marpa_v_step()
?
Yes, as noted in the last sentence of the issue description.
@dabrahams Actually, re your comment before last, I was checking that line, if not in the most graceful way. I cleaned it up in the commit just referenced, and it presents two mysteries:
First, it does not duplicate your bug.
Second, even so, it is not clear it is right. Shouldn't there be a MARPA_STEP_NULLING_SYMBOL
step? I set up the test so that the actual behavior, a MARPA_STEP_INACTIVE
step, is treated as correct, but I'm not sure it should be.
I still am feeling the lingering effects of that bug, so I will leave this until the AM.
I was checking that line
Specifically, were you checking the result of marpa_v_es_id(v)
and after that line has executed?
I still am feeling the lingering effects of that bug, so I will leave this until the AM
Indeed, buggy coders make for buggy code. Hope you feel better!
The value of marpa_v_es_id(v)
is indeterminate when the step type code is MARPA_STEP_INACTIVE
. I inserted a printf after the marpa_v_step()
line is executed, and FWIW it shows that the value is -1.
My plan is next to determine what libmarpa should be doing when stepping through a trivial parse. (By the way, trivial parses are special-cased -- I found that doing so makes for simpler and safer code, my present embarrassment notwithstanding.) I will want to do a bit of checking, so it looks like I was over-hasty before.
If libmarpa is not behaving correctly for trivial parses (which I suspect I will find is the case), the next step is to fix it. This should not be hard, due to the special-casing.
At that point we check for your issue again, to see if it is cleared up. If so, we declare victory and leave. If not, we proceed from there.
Progress is slow, because my personal bug is lingering, but I have some insight.
I note that my "trivial1" test does not call marpa_g_forced_value()
. Your Swift code does force "valued" status, which is a good idea and as per the API's recommendation. This difference would account for my previous failure to duplicate the bug.
My next step will be to add a test to the libmarpa test suite which duplicates this bug. I expect things will be straight-forward from there.
The commit last referenced duplicates the bug.
I have written up a new version of the documentation of the libmarpa ES location accessors, clarifying their expected behavior.
The expected ES location values of the nulled top node in a trivial parse are marpa_v_es_id (v) == 0
and marpa_v_token_start_es_id (v) == 0
, where v
is a valuator whose current step is that nulled top node.
This is NOT the current behavor of libmarpa. My plan:
With https://github.com/jeffreykegler/libmarpa/commit/7686e5361bb103192124beabba430b875d2510e3 I have finished all but the last step in the plan outlined above. The tasks accomplished include the fix itself and its libmarpa tests.
The one remaining task is to test this fix with Marpa::R2. Before I mark this issue "About to be closed", I will do this.
The fix tests OK in Marpa::R2 without any revisions. (Marpa::R2 special cases trivial parses, a practice I recommend for all wrappers.) I am marking this issue as read to be closed.
Do you have a link to details of what you mean by that recommendation?
Well, there is the Marpa::R2 code, but that makes a simple recommendation complicated. The idea is that the result of trivial parse is a constant, since the input is.
So the wrapper should check first if the input is zero length. Then it should check with marpa_g_symbol_is_nullable
that the start symbol is nullable. If not, it's a parse failure.
In the case of a successful parse of the zero length input, the result will be a constant, predictable in advance unless you're doing something wierd. Wierdness or no wierdness, Earley's algorithm has nothing to say that will help you in dealing with it.
If you'd like to a link to the Marpa::R2, let me know. Also, the IRC channel may contain people who have dealt with this. In particular, @jddurand, who usually is lurking, has probably dealt with this in some of his repos.
Here's where I special-case the semantics of nullable grammars in Marpa::R2: https://github.com/jeffreykegler/Marpa--R2/blob/2ef61bdf4d0ff2de3503ca136dbe4a18ff742f9b/cpan/lib/Marpa/R2/Value.pm#L1306 Now that I look at it, it may be the rest of the code handles zero-length inputs with the main code -- that's the only special case-ing of nullable grammars I can find.
What's the point in spending optimization effort on this edge case that will be very fast to evaluate anyway? Or am I failing to understand the scenario?
The point is not optimization (AFAICT special casing is no faster and unless the optimizer is very very smart will cause a tiny bit of code bloat). It's just that I have found that if you exclude zero-length parses from the main logic flow, you wind up with cleaner code. YMMV, as they say.
As an example, consider in marpa.w folding the code of
into
My experience/belief is that, yes, this replaces two flows of logic with one, but the ease of reading and debugging is better with the two logic flows. The second one is hardly what you'd call clean, but I'd say that all the more reason to be desperate to avoid complicating it further. The first one (for the trivial parse) can make quite a few helpful assumptions not valid in the general case, and keeping it separate means that pulling out the behavior for the trivial case is simple. If the two logics were merged it would be quite the task to figure out what happens in the trivial case and (as I have just had to) to change it without breaking something else.
From several experiences like that I've generalized to the principle "special case the trivial parse". A programmer with different experiences, and/or different talents, might reach a different conclusion.
Closed for the reason given in https://github.com/jeffreykegler/libmarpa/issues/92#issuecomment-1097289497.
This shows the state of a Valuator in my debugger:
Note that the current step type is 3 (
MARPA_STEP_NULLING_SYMBOL
) butt_ys_id
is -1, which is what gets returned bymarpa_v_es_id
. However, the docs for that function say:Presumably -1 is not a valid Earley set ordinal?
FYI, I got here by trying to follow the pattern of the
trivial1.c
test (albeit without the erroneous usages). The state is after the first call tomarpa_v_step
.