WerWolv / ImHex

🔍 A Hex Editor for Reverse Engineers, Programmers and people who value their retinas when working at 3 AM.
https://imhex.werwolv.net
GNU General Public License v2.0
43.09k stars 1.89k forks source link

[Bug] Struct type parameters bind to variables when evaluated, not when defined #1284

Open bgilbert opened 1 year ago

bgilbert commented 1 year ago

Operating System

Linux

What's the issue you encountered?

If an outer scope S has a variable v and creates a parameterized type A<B<v>>, but A also has a variable called v, A will see the type B<v> bound to its own value for v rather than to S's value. If A does not have a variable called v, A will see v bound to S's value, as expected.

This behavior seems surprising, since S probably expected to pass v by value, independent of any variable names that A may happen to use.

How can the issue be reproduced?

As expected, this prints 5:

#include <std/io.pat>
#include <std/mem.pat>
// "auto C" to work around https://github.com/WerWolv/ImHex/issues/1283
struct S<T, auto C> {
    std::print("{}", sizeof(T));
};
u8 count = 5;
S<std::mem::Bytes<count>, 7> a;

But this prints 2:

#include <std/io.pat>
#include <std/mem.pat>
struct S<T, auto C> {
    u8 count = 2;
    std::print("{}", sizeof(T));
};
u8 count = 5;
S<std::mem::Bytes<count>, 7> a;

ImHex Version

1.30.1

ImHex Build Type

Installation type

Fedora (imhex-1.30.1-3.fc38.x86_64)

Additional context?

No response

paxcut commented 1 year ago

i dont think you can override a local variable by creating a global with the same name. The following is what works

#include <std/io.pat>
#include <std/mem.pat>
struct S<T, auto C> {
    u8 count = 2;
    std::print("{}", sizeof(T));
};
S<std::mem::Bytes<count>, 7> a;

so no matter what you define this will always return 2. I am not sure what the intended behavior would be when you define a global variable called count. Yuo can interpret it as the global being assigned to the template which is then passed to the struct to create a T of 5 bytes, but you can also see it as T not being evaluated until sizeof(T) is called in which case the local variable takes precedence.

bgilbert commented 1 year ago

Based on the behavior of other programming languages, I expected count to be passed by value, not by name. For example, in Python, I'd be surprised if this printed 2:

def f(v):
    count = 2
    print(v)

count = 5
f(count)

If the current behavior is intended, it'd be good to update the documentation to make this clear. The docs currently say:

Templates can be used to substitute parts of a custom type’s member’s types with placeholders which can then be defined later on when instantiating this type.

when instantiating this type is perhaps ambiguous if you squint at it, but I think the most natural reading is consistent with the behavior I was expecting.

paxcut commented 1 year ago

that python example is not a good analogy because this code

def f(v):
    count = 2
    print(v)

f(count)

is not valid python code. I think the issue is that given that

// Code A
#include <std/io.pat>
#include <std/mem.pat>
struct S<T, auto C> {
    u8 count = 2;
    std::print("{}", sizeof(T));
};
S<std::mem::Bytes<count>, 7> a;

and

// Code B
#include <std/io.pat>
#include <std/mem.pat>
struct S<T, auto C> {
    std::print("{}", sizeof(T));
};
u8 count = 5;
S<std::mem::Bytes<count>, 7> a;

work and give different answers what should the following code return?

// Code C
#include <std/io.pat>
#include <std/mem.pat>
struct S<T, auto C> {
    u8 count = 2;
    std::print("{}", sizeof(T));
};
u8 count = 5;
S<std::mem::Bytes<count>, 7> a;

The fact that Code A works means that count is not just a variable that needs to be defined before the instantiation of the template occurs. It is clear that T is not evaluated and bound until sizeof(T) is called as there is no need to bind it until then. Under those conditions one would expect that the definition closest to the usage of the template argument to take precedence and in this case that is the local variable.

To try to make analogies with other languages you need to use a language that would allow Code A, or something like it, to work. The pattern language is flexible enough to allow both Code A and Code B to be valid and work as intended. I am not sure what that means technically speaking about the behavior of Code C, but the fact that it doesn't work as most programming languages should be apparent given that Code A works.

bgilbert commented 1 year ago

My point was that, as a new user of pl, I expected similar-looking syntax to work similarly to other languages. I didn't expect Code A to be valid syntax, so had no reason to try it. I expected Code B to work, and it did. But then I introduced what I thought was an unrelated variable in Code C, and was surprised when it changed the result.

I don't think the current behavior is ideal, since it violates encapsulation: callers must care about variable names that happen to be used in their callees. However, if this behavior is intentional, or if converting Code A to an error would be too disruptive, I agree that Code C's behavior is consistent with the current design. In that case, the pass-by-name behavior should be documented explicitly, since it isn't necessarily obvious that template evaluation works that way.

paxcut commented 1 year ago

I am sorry if I gave the impression that the question was not a valid one. I believe that this is the first time this has come up and it merits further discussion as you well point out. I was merely trying to deduce why the behavior doesn't match expectations. I have been using pl for a few months quite extensively but I don't know strict answers to these ambiguous notation questions that the language introduces. I don't know what the intent of the creator of the language was so all I can do is try to deduce it using evidence based on its current behavior.

The issue with documentation is well known and duly noted. I believe that in its current states, ImHex documentation serves more as a guideline for its use more than as a full explanation of the pattern language. This is somewhat expected in a volunteer effort where everybody is welcome to edit the documentation by submitting PRs to its repo. It is not surprising that the code receives a lot more attention that the documentation in terms of submissions. I don't mean to make this sound as an excuse, it is merely a personal observation.

In my attempt to provide answers I may have neglected to state it, but I'd like to assure you that I agree with all the points you make. I hope you can forgive me if it seemed that I did not.

bgilbert commented 1 year ago

No worries, that makes sense. Thanks for digging into this.

github-actions[bot] commented 1 month ago

This issue is marked stale as it has been open for 11 months without activity. Please try the latest ImHex version. (Avaiable here: https://imhex.download/ for release and https://imhex.download/#nightly for development version) If the issue persists on the latest version, please make a comment on this issue again

Without response, this issue will be closed in one month.

bgilbert commented 1 month ago

Behavior still exists in 1.35.4.