exercism / java-analyzer

GNU Affero General Public License v3.0
7 stars 14 forks source link

Analyzer upgrade #39

Closed olapokon closed 6 months ago

olapokon commented 2 years ago

Hello. I am interested in contributing to exercism, and specifically to the analyzers and representers. I am opening a new issue here, as discussed in https://github.com/exercism/java/issues/1969

I looked a bit into the existing analyzer and here are some things I noticed so far.

Regarding the new Analyzer interface:

  1. The CLI interface doesn't seem to conform to that described, as it does not accept an output directory. It writes the analysis.json file to the exercise directory by default.
  2. The "status" field in the output should be removed, as it is no longer needed.

Other things:

  1. It looks like ExerciseRegistry is not populated with the existing exercise classes so it's not possible to use the analyzer through the CLI at all (unless I make a minor change to make sure they load)? It's possible I'm just missing something, but since these classes are put in the registry in a static block, it seems this block is never run because the classes are not loaded, so there is an exception that they are not in the registry. Does anyone know why this may be happening in my case, or if it is a problem in general?

  2. Since the analyzer depends on the comments here https://github.com/exercism/website-copy/tree/47af5b309ac263629ca5c52904046f81e0cc8def/automated-comments/java does that mean that it should only accept exercises for which a directory in automated-comments and a class in java-analyzer exercises package are available? Or is it intended that the analyzer should eventually be refactored to be able to accept all exercises and at least apply the general comments?

jmrunkle commented 2 years ago

It looks like ExerciseRegistry is not populated with the existing exercise classes so it's not possible to use the analyzer through the CLI at all (unless I make a minor change to make sure they load)? It's possible I'm just missing something, but since these classes are put in the registry in a static block, it seems this block is never run because the classes are not loaded, so there is an exception that they are not in the registry. Does anyone know why this may be happening in my case, or if it is a problem in general?

If that is the case, it just means that we broke it while trying to setup the ExerciseRegistry. The goal was to be able to register the exercises without having to continually modify the main class anytime a new exercise is added. If it is not working, by all means go ahead and change it. We used to have a simpler switch-case (and we could always go back to that).

Since the analyzer depends on the comments here https://github.com/exercism/website-copy/tree/47af5b309ac263629ca5c52904046f81e0cc8def/automated-comments/java does that mean that it should only accept exercises for which a directory in automated-comments and a class in java-analyzer exercises package are available? Or is it intended that the analyzer should eventually be refactored to be able to accept all exercises and at least apply the general comments?

We should certainly only accept exercises which have comments, but I believe the infra is setup to just fail if you try to tell the website to use a comment that does not exist (yet). The way this is supposed to work is that you first add / merge a PR for the comments and then you add / merge a PR modifying the analyzer since there is no way to submit a PR that spans two repos.

As far as I can tell, there is no desire or capability of applying just the general comments (at least not without implementing some default analyzer that only looks for some standard set of things). I think it would be very difficult to come up with a standard set of things that could be checked generically by looking at the AST.

Smarticles101 commented 2 years ago

I've opened a PR to revert the ExerciseRegistry. The problem is exactly as you described: static blocks are not being run because the Exercise classes were never loaded into memory.

olapokon commented 2 years ago

Ah ok, so it was a general problem after all.

As far as I can tell, there is no desire or capability of applying just the general comments (at least not without implementing some default analyzer that only looks for some standard set of things).

In that case it seems the analyzer is already in a pretty good shape. I guess most of the work that needs to be done has to do with adding more exercises?

There are also the changes needed for the new interface. Should I start with removing the status and adding a third parameter for the output directory to the CLI, or is someone working on that already?

jmrunkle commented 2 years ago

There are also the changes needed for the new interface. Should I start with removing the status and adding a third parameter for the output directory to the CLI, or is someone working on that already?

As far as any of us know, there is no one else actively working on any PRs here. I have a super-old PR where I was setting up automated analyses for Gigasecond, but it is pretty stale now since I haven't touched it in years. Therefore, feel free to start working on those things. Perhaps create a top-level tracker issue for the changes you need to make so that it is clear what you are doing, but I would not worry too much about conflicting with someone else since I doubt there is any other outstanding work. Another option is to draft the PR quickly (not implemented, just to claim the work).

olapokon commented 2 years ago

Thanks, I will make a new issue then, starting with the third CLI parameter.

mirkoperillo commented 2 years ago

@olapokon your PR #43 is finally merged. What's about the next steps ?

olapokon commented 2 years ago

@mirkoperillo The main things mentioned above (the output directory parameter and removing "status") are done, so I was thinking of doing some refactoring, if there aren't any objections.

As it is now, if there are no params, the comment is a single string. e.g.

{
   "comments":[
      "java.hamming.avoid_character_literals"
   ]
}

If there are params, it is an object with a comment field which has the comment key, and a params field. e.g.

{
   "comments":[
      {
         "comment":"java.general.constructor_too_long",
         "params":{
            "constructorNames":"Hamming"
         }
      }
   ]
}

Should it be changed so that it is always an object with a params field that may be empty (or missing)? Or is it this way on purpose?

A separate reader and writer could be created, so that the Exercise class and subclasses are decoupled from reading and writing. They would only receive the text and return the analysis json.

Could also add an enum for the exercises and do some other minor refactoring along the way.

mirkoperillo commented 2 years ago

Hi @olapokon

Should it be changed so that it is always an object with a params field that may be empty (or missing)? Or is it this way on purpose?

Better to maintain coherency. Always return an object with an empty params (if params is not needed)

olapokon commented 2 years ago

Hi, I added a commit to the pull request to always have params.

sanderploegsma commented 6 months ago

Closing this as I believe everything that this issue addresses has been implemented.