rtoy / maxima

A Clone of Maxima's repo
Other
0 stars 0 forks source link

Hash tables apply EQUAL as key test instead of ALIKE, leads to failure to match key #3929

Open rtoy opened 4 months ago

rtoy commented 4 months ago

Imported from SourceForge on 2024-07-08 21:18:57 Created by robert_dodier on 2022-09-23 02:41:14 Original: https://sourceforge.net/p/maxima/bugs/4030


CL hash tables (created explicitly by make_array(hashed) or implicitly by use_fast_arrays: true and then assigning to a subscripted variable) are created by Maxima with :test being EQUAL. See MAKE-EQUAL-HASH-TABLE in src/clmacs.lisp.

The key test for CL MAKE-HASH-TABLE can only be one of a few built-in functions, namely EQ, EQL, EQUAL, or EQUALP. That rules out using ALIKE for the test. Since ALIKE is called to test keys for the other kind of hash table (i.e., an "undeclared array"), that means CL hash tables and undeclared arrays can act differently.

In particular, when an expression car contains debugging stuff or other irrelevant flags, a key can fail EQUAL but pass ALIKE. I ran into this when working with code which was loaded from a file. When loaded from a file, expression cars contained debugging stuff and key matches failed. When the same code was copied and pasted into the interactive console, the debugging stuff was absent and keys matched.

It appears that there is no way to get MAKE-HASH-TABLE use a different test, so the resolution is probably to somehow get hash tables to punt to the same code as undeclared arrays.

rtoy commented 4 months ago

Imported from SourceForge on 2024-07-08 21:18:58 Created by macrakis on 2022-09-23 15:12:16 Original: https://sourceforge.net/p/maxima/bugs/4030/#c7bf


The only way I can see to guarantee correct behavior is to canonicalize array subscript values and use EQUAL. The vast majority of expressions will only have SIMP and ARRAY flags, and will have them in the correct order, so it won't be necessary to copy them. But it is necessary to traverse them completely. Fortunately, most array subscripts are small.

There is one bad case I can think of: expressions missing SIMP flags (using SIMP:FALSE). But since unsimplified expressions won't be canonicalized in other ways (e.g. variable ordering), and since SIMP:FALSE is explicitly documented as breaking most of Maxima, I think it's OK that a[] and a[] be considered different.

PS You might think that CREs need to be canonicalized, too (because the variable list can be different between otherwise equal values), but CREs and other special expressions are already disallowed as subscripts.

rtoy commented 4 months ago

Imported from SourceForge on 2024-07-08 21:19:02 Created by robert_dodier on 2022-09-24 03:56:11 Original: https://sourceforge.net/p/maxima/bugs/4030/#3c44


Well, I'm not in favor of making the test EQUAL all around. I bumped into the difference between EQUAL and ALIKE in what I believe is a totally innocuous and non-obtuse way -- keys which were in a script loaded by batch were different from keys typed into the console. I believe the answer has to be to make all hash tables apply ALIKE instead of EQUAL.

What I've been thinking about is how to make Maxima hash tables into values. (The important aspect of use_fast_arrays and make_array(hashed) is that hash tables are values instead of properties. It isn't important that the implementation be the Lisp implementation's hash table.) The Maxima hash table implementation comprises some stuff which attached to a gensym -- maybe to make Maxima hash tables into values, all that needs to be done is to define a Common Lisp struct comprises those bits, and put an instance of that struct into the value slot.

rtoy commented 4 months ago

Imported from SourceForge on 2024-07-08 21:19:05 Created by macrakis on 2022-09-24 16:55:21 Original: https://sourceforge.net/p/maxima/bugs/4030/#a10a


I agree that the current behavior is wrong. I proposed a way to fix it that uses :test equal -- why do you object to that?

Packaging fast arrays more cleanly is completely orthogonal to this issue -- perhaps open a different issue for that? The problem with using Common Lisp structs is that they are considered ATOMs, and as we've discussed before, Maxima assumes in many places that atoms can only be numbers or symbols.

rtoy commented 4 months ago

Imported from SourceForge on 2024-07-08 21:19:09 Created by robert_dodier on 2022-09-25 06:20:19 Original: https://sourceforge.net/p/maxima/bugs/4030/#a10a/78c9


I proposed a way to fix it that uses :test equal -- why do you object to that?

Hash tables should be predictable from a user's point of view, and it seems obvious to me that the test to be applied should therefore be whatever is applied by is(x = y), which calls ALIKE1. Undeclared arrays tests keys with ALIKE, which calls ALIKE1. Given that "fast" arrays (CL hash tables) apply a different test, their behavior isn't predictable from a user point of view. In particular, apparently identical expressions (the same except for car flags) fail EQUAL, and that makes its behavior unpredictable from a user's point of view.

Packaging fast arrays more cleanly is completely orthogonal to this issue -- perhaps open a different issue for that?

I guess I wasn't clear. The packaging is to encapsulate the Maxima undeclared array implementation, so that the same implementation can be used both for hash tables attached to symbol properties and for hash tables in the value slot. I'm not proposing to encapsulate "fast" (CL hash table) arrays -- they would become unused in this picture.

The important thing about "fast" arrays is that they are values instead of properties; this is the motivation for the business about a CL struct. "Fast" arrays are not, from what I know, actually any faster than the Maxima hash tables (I did try timing both kinds long ago).

The problem with using Common Lisp structs is that they are considered ATOMs, and as we've discussed before, Maxima assumes in many places that atoms can only be numbers or symbols.

I dunno, there isn't any new problem introduced by defining a new CL struct. Such a struct would replace CL hash tables, which are also atoms.

CL arrays also exist in Maxima, as created by make_array(<sometype>, nnn), and those are also atoms.

rtoy commented 4 months ago

Imported from SourceForge on 2024-07-08 21:19:12 Created by macrakis on 2022-09-26 19:23:15 Original: https://sourceforge.net/p/maxima/bugs/4030/#a168


I agree about predictability. That was the whole point of canonicalizing subscript values, so that EQUAL on canonicalized values has exactly the same behavior as ALIKE1. Or am I missing something?

rtoy commented 4 months ago

Imported from SourceForge on 2024-07-08 21:19:16 Created by robert_dodier on 2022-09-27 05:20:56 Original: https://sourceforge.net/p/maxima/bugs/4030/#a168/84f9


Sorry, I overlooked the part about canonicalizing values. EQUAL on canonicalized values would probably be pretty close to ALIKE1, but it's not guaranteed, short of reviewing ALIKE1 and ensuring that every corner case is treated the same. That's going to be subtle and susceptible to obscure bugs, and a nontrivial amount of work; it is a simpler, straightforward task to just ensure that whatever is done now for undeclared arrays is also done for "fast" arrays.

rtoy commented 4 months ago

Imported from SourceForge on 2024-07-08 21:19:19 Created by macrakis on 2024-04-08 13:59:00 Original: https://sourceforge.net/p/maxima/bugs/4030/#39e2


Sorry, I overlooked the part about canonicalizing values. EQUAL on canonicalized values would probably be pretty close to ALIKE1, but it's not guaranteed, short of reviewing ALIKE1 and ensuring that every corner case is treated the same. That's going to be subtle and susceptible to obscure bugs, and a nontrivial amount of work; it is a simpler, straightforward task to just ensure that whatever is done now for undeclared arrays is also done for "fast" arrays.

alike1 is not a complicated function. The only slightly complicated case is the way it handles Lisp arrays (which may not even be necessary -- see 4283).