exercism / elixir-representer

GNU Affero General Public License v3.0
10 stars 6 forks source link

Normalize struct names #37

Open jiegillet opened 2 years ago

jiegillet commented 2 years ago

This is mentioned in #29.

Struct names should be replaced by placeholders as well.

Example input

defmodule Something do
  defstruct [:a, :b]
  def something(%Something{a: a}), do: %Something{a: 2 * a}
  def something_else(%__MODULE__{a: a}), do: %__MODULE__{a: 3 * a}
end

Desired output

defmodule(PLACEHOLDER_1) do
  defstruct([:a, :b])
  def(PLACEHOLDER_2(%PLACEHOLDER_1{a: PLACEHOLDER_3})) do
    %PLACEHOLDER_1{a: 2 * PLACEHOLDER_3}
  end
  def(PLACEHOLDER_4(%__MODULE__{a: PLACEHOLDER_3})) do
    %__MODULE__{a: 3 * PLACEHOLDER_3}
  end
end

I think __MODULE__ should not be replaced, because it means different things in different modules. It's already normalized, in a way...

Current output

defmodule(PLACEHOLDER_1) do
  defstruct([:a, :b])
  def(PLACEHOLDER_2(%Something{a: PLACEHOLDER_3})) do
    %Something{a: 2 * PLACEHOLDER_3}
  end
  def(PLACEHOLDER_4(%__MODULE__{a: PLACEHOLDER_3})) do
    %__MODULE__{a: 3 * PLACEHOLDER_3}
  end
end
butterknight commented 2 years ago

Hi! 👋 I tried playing around with this in an effort to both learn Elixir and try to make something productive.

I created a new folder inside test_data with:

input.ex

defmodule Something do
  defstruct [:a, :b]
  def something(%Something{a: a}), do: %Something{a: 2 * a}
  def something_else(%__MODULE__{a: a}), do: %__MODULE__{a: 3 * a}
end

expected_representation.txt

defmodule Placeholder_1 do
  defstruct [:a, :b]

  def placeholder_2(%Placeholder_1{a: placeholder_3}) do
    %Placeholder_1{a: 2 * placeholder_3}
  end

  def placeholder_4(%__MODULE__{a: placeholder_3}) do
    %__MODULE__{a: 3 * placeholder_3}
  end
end

expected_mapping.json

{
  "Placeholder_1": "Something",
  "placeholder_2": "something",
  "placeholder_3": "a",
  "placeholder_4": "something_else"
}

And I wanted to see the current result so I ran the test I got this:

image

If I read this correctly, it would appear that the struct name is already properly normalised, but the key wasn't. Did I do something wrong or is this already partially implemented?

angelikatyborska commented 2 years ago

Hmm... @jiegillet is it possible that we accidentally implemented this in https://github.com/exercism/elixir-representer/pull/44 ?

jiegillet commented 2 years ago

Yes, it's entirely possible :) It happened when we normalized module names, the struct names followed along. I didn't close this issue because of the key behavior.

I don't necessarily think that we should normalize all map keys indiscriminately, but we certainly can do it for the ones in defstruct because they are user defined. In that sense the current behavior is a step in the right direction and what is missing is

defstruct [:placeholder_1, :placeholder_2] # or whatever numbers

@angelikatyborska what do you think? I also suspect that #44 may have introduced some undesirable behavior with key replacement, so I would suggest to test the representer on some larger codebase that uses maps in various ways...

Quick example: in import A, only: [function_a: 1] we don't want to touch only or function_a. Right now I am not confident that these are safe if we have a only variable somewhere else being normalized.

angelikatyborska commented 2 years ago

I don't necessarily think that we should normalize all map keys indiscriminately

I'm having a hard time coming up with a good rule about what to normalize when with regard to map, struct, and keyword list keys.

In general, it seems to me like key names are just like variable names and should be normalized. But then I start coming up with a bunch of exceptions: keys in structs that come from the standard library are always known and shouldn't be normalized. Keys in some special usage keyword lists like import A, only: ... or IO.inspect(x, label: "x"). I feel like trying to chase each exception will drive us crazy and I would much rather go all in or all out - normalize all of the keys or none of the keys. I'm also curious what other languages do with similar problems.

I'm currently writing a bunch of more tests that just pass to highlight the current behavior: https://github.com/exercism/elixir-representer/pull/73/files

For example struct names are only kept if they're not user defined, which is great. And you're right about this bug:

Right now I am not confident that these are safe if we have a only variable somewhere else being normalized.

angelikatyborska commented 2 years ago

@iHiD advised on Slack:

About representers: if you have key-value data structures, should you replace key names with placeholders?

If they don't matter to the implementation of the exercise (which as far as I can tell they shouldn't otherwise the tests would fail), then yes.

iHiD commented 2 years ago

Quick example: in import A, only: [function_a: 1] we don't want to touch only or function_a. Right now I am not confident that these are safe if we have a only variable somewhere else being normalized.

I know this isn't the point of this discussion, but removing import statements entirely is potentially quite a good idea. (F# does this: https://exercism.org/docs/tracks/fsharp/representer-normalizations#h-remove-import-declarations)

angelikatyborska commented 2 years ago

@iHiD What's the reasoning behind removing imports? I suspect they might not apply in Elixir.

iHiD commented 2 years ago

I'd sort of flip the question and ask what's the reason for keeping them?

The solution being normalised passes the tests, so it's importing whatever it needs to import. So the question becomes are they things you'd ever want to give feedback on?

My instinct is that they're not things that we need to give feedback on automatically, so increasingly the likelihood of normalisation feels better.

angelikatyborska commented 2 years ago

So the question becomes are they things you'd ever want to give feedback on?

Most of the time, yes.

I can see those scenarios for imports in Exercism exercises:

jiegillet commented 2 years ago

e.g. if someone imports A then B, and another student imports B then A, the placeholders will end up entirely different and so the representations will not normalise

That's a good idea, let's order imports :)

angelikatyborska commented 2 years ago

@jiegillet I take your comment to mean that you agree that imports should stay in the representations 😁 I created an issue about ordering them: https://github.com/exercism/elixir-representer/issues/75

iHiD commented 2 years ago

I can see those scenarios for imports in Exercism exercises:

While those are solid examples, all three of them feel like they could be really good analyzer checks that would cover all exercises and then allow the representations to be further normalised and then reduce the amount of feedback given manually in the automation UI.

But I'm entirely happy to defer to whatever you decide - you know the range of possibilities in Elixir much better than me! 🙂