Open jiegillet opened 3 years ago
Hi! 👋 I tried playing around with this in an effort to both learn Elixir and try to make something productive.
I created a new folder inside test_data
with:
input.ex
defmodule Something do
defstruct [:a, :b]
def something(%Something{a: a}), do: %Something{a: 2 * a}
def something_else(%__MODULE__{a: a}), do: %__MODULE__{a: 3 * a}
end
expected_representation.txt
defmodule Placeholder_1 do
defstruct [:a, :b]
def placeholder_2(%Placeholder_1{a: placeholder_3}) do
%Placeholder_1{a: 2 * placeholder_3}
end
def placeholder_4(%__MODULE__{a: placeholder_3}) do
%__MODULE__{a: 3 * placeholder_3}
end
end
expected_mapping.json
{
"Placeholder_1": "Something",
"placeholder_2": "something",
"placeholder_3": "a",
"placeholder_4": "something_else"
}
And I wanted to see the current result so I ran the test I got this:
If I read this correctly, it would appear that the struct name is already properly normalised, but the key wasn't. Did I do something wrong or is this already partially implemented?
Hmm... @jiegillet is it possible that we accidentally implemented this in https://github.com/exercism/elixir-representer/pull/44 ?
Yes, it's entirely possible :) It happened when we normalized module names, the struct names followed along. I didn't close this issue because of the key behavior.
I don't necessarily think that we should normalize all map keys indiscriminately, but we certainly can do it for the ones in defstruct
because they are user defined. In that sense the current behavior is a step in the right direction and what is missing is
defstruct [:placeholder_1, :placeholder_2] # or whatever numbers
@angelikatyborska what do you think? I also suspect that #44 may have introduced some undesirable behavior with key replacement, so I would suggest to test the representer on some larger codebase that uses maps in various ways...
Quick example: in import A, only: [function_a: 1]
we don't want to touch only
or function_a
. Right now I am not confident that these are safe if we have a only
variable somewhere else being normalized.
I don't necessarily think that we should normalize all map keys indiscriminately
I'm having a hard time coming up with a good rule about what to normalize when with regard to map, struct, and keyword list keys.
In general, it seems to me like key names are just like variable names and should be normalized. But then I start coming up with a bunch of exceptions: keys in structs that come from the standard library are always known and shouldn't be normalized. Keys in some special usage keyword lists like import A, only: ...
or IO.inspect(x, label: "x")
. I feel like trying to chase each exception will drive us crazy and I would much rather go all in or all out - normalize all of the keys or none of the keys. I'm also curious what other languages do with similar problems.
I'm currently writing a bunch of more tests that just pass to highlight the current behavior: https://github.com/exercism/elixir-representer/pull/73/files
For example struct names are only kept if they're not user defined, which is great. And you're right about this bug:
Right now I am not confident that these are safe if we have a only variable somewhere else being normalized.
@iHiD advised on Slack:
About representers: if you have key-value data structures, should you replace key names with placeholders?
If they don't matter to the implementation of the exercise (which as far as I can tell they shouldn't otherwise the tests would fail), then yes.
Quick example: in import A, only: [function_a: 1] we don't want to touch only or function_a. Right now I am not confident that these are safe if we have a only variable somewhere else being normalized.
I know this isn't the point of this discussion, but removing import statements entirely is potentially quite a good idea. (F# does this: https://exercism.org/docs/tracks/fsharp/representer-normalizations#h-remove-import-declarations)
@iHiD What's the reasoning behind removing imports? I suspect they might not apply in Elixir.
I'd sort of flip the question and ask what's the reason for keeping them?
The solution being normalised passes the tests, so it's importing whatever it needs to import. So the question becomes are they things you'd ever want to give feedback on?
only: [...]
) or imports something but doesn't use it).My instinct is that they're not things that we need to give feedback on automatically, so increasingly the likelihood of normalisation feels better.
So the question becomes are they things you'd ever want to give feedback on?
Most of the time, yes.
I can see those scenarios for imports in Exercism exercises:
:only
vs :except
would be good.e.g. if someone imports A then B, and another student imports B then A, the placeholders will end up entirely different and so the representations will not normalise
That's a good idea, let's order imports :)
@jiegillet I take your comment to mean that you agree that imports should stay in the representations 😁 I created an issue about ordering them: https://github.com/exercism/elixir-representer/issues/75
I can see those scenarios for imports in Exercism exercises:
While those are solid examples, all three of them feel like they could be really good analyzer checks that would cover all exercises and then allow the representations to be further normalised and then reduce the amount of feedback given manually in the automation UI.
But I'm entirely happy to defer to whatever you decide - you know the range of possibilities in Elixir much better than me! 🙂
This is mentioned in #29.
Struct names should be replaced by placeholders as well.
Example input
Desired output
I think
__MODULE__
should not be replaced, because it means different things in different modules. It's already normalized, in a way...Current output