exercism / elixir-analyzer

GNU Affero General Public License v3.0
30 stars 32 forks source link

Redundant advice on the German Sysadmin exercise #384

Closed fjrivasf closed 1 year ago

fjrivasf commented 1 year ago

The analyzer is telling me to use the ?(char) syntax, even though that's exactly what I'm already doing.

Elixir has a special syntax to write integers that correspond to Unicode Code Points: you can use a ? in front of the character literal to reveal its code point.

For example,

?a === 97 ?\s === 32 ?ß === 223 ?互 === 20114 In the context of manipulating code points, this notation is much more readable than numbers and should always be preferred.

I'm using module attributes and comprehensions to deal with the filtering part of this problem. My guess is that the comprehension is being expanded at compile time, and that the analyzer is seeing raw digits even though the actual source code has none.

@alphabet for x <- ?_..?z, x != ?`, do: x
@german_chars 'äöüß'
@german_alphabet @german_chars ++ @alphabet

  def sanitize(''), do: ''
  def sanitize([h|t]) when h not in @german_alphabet, do: sanitize t
  def sanitize([h|t]) when h in @alphabet, do: [h | sanitize t]
angelikatyborska commented 1 year ago

Hi! You're right that the feedback in this case is wrong, but the cause is a bit different than that.

The check is done on a string with the source code, because as you noticed, after compilation you cannot tell anymore which syntax was used to create the integer value. The check is this:

integers = ["?ß", "?ä", "?ö", "?ü", "?_", "?a", "?z"]
Enum.all?(integers, &String.contains?(code_string, &1))

We knew it will be buggy when we wrote it, but it's an very important check for Elixir beginners (not knowing about the ?A syntax is one of the common mistakes I saw in charlist related exercises when I was mentoring them) so we settled on a buggy check to have something working at least in some cases.

I'm open to ideas how to improve it, I don't have any at the moment.

fjrivasf commented 1 year ago

Thank you for the explanation! That makes a lot of sense.

After reading the motive behind it, I don't think the current approach is wrong. I can't think of many other ways to check for the specific chars needed in this exercise that wouldn't pass that check and also wouldn't kill the readability of the solution. Even my approach is a little bit hacky since I'm including the underscore as part of the alphabet without further explanation.

Since you're looking for those specific strings anyway, it might be worth it to do the reverse check and look for the integer values instead, and using those values explicitly in the warning text. I think the bot already does something similar when checking for snake case or Pascal case. This would have the added benefit of also letting the student figure out it might be a misdetection on the off-chance that one happens.