tweag / nickel

Better configuration for less
https://nickel-lang.org/
MIT License
2.23k stars 85 forks source link

Remove different behaviour when single input #1902

Closed olorin37 closed 2 months ago

olorin37 commented 2 months ago

Addresses #1901. Proposition.

It could be so simple, but it seems that cli tests rely on specific behaviour for single file input. I'll think about it later, cause it requires grasping the way tests are made... What do you think? Is it good direction?

vkleen commented 2 months ago

Personally, I like the approach. The extra indirection through an import should be entirely inconsequential. I don't see a good reason why tests couldn't be made to work with this either, but I haven't given it much thought.

olorin37 commented 2 months ago

Surely tests can be aligned, but I had to lest the issue for other duties, for a while.

olorin37 commented 2 months ago

The main problem still remains. Snapshots for cli tests now contains import "FILE" instead actual content...

Some explicit occurences of new_from_file has been removed.

yannham commented 2 months ago

Thanks for tackling this, @olorin37. While it sounded like a quick and efficient solution initially, given the tests and the small annoyances coming from having this additional import indirection, I wonder if we shouldn't take a different path.

Here is the crux: when importing a file, the import resolver (the cache) parse the file according to its extension. So it selects a format based on the file name. While it's a bit fragile, it's currently still very useful and does the job. However, when loading a file without an import, the cache always parses it as Nickel.

I think it doesn't make a lot of sense: if we automatically use the file extension to drive how imported files are parsed, we should probably do the same for the main file as well. I believe it mostly amounts to get rid of all the xxx_multi methods in cache::Cache, or rather, to get rid of the non-multi variants and rename all xxx_multi to just xxx. As an example: get rid of parse and rename parse_multi to just parse. Then fix the callsites that will missing a new additional ImportFormat parameter by providing InputFormat::Nickel.

In a second step, maybe we can add a parse_auto as well, that will have the same signature as the current parse but will determine the format based on the file extension, and default to Nickel. This is exactly what Cache::resolve (used for import resolution) is doing, so this just amounts to extract a small part of it and putting in this new parse_auto function.

The last step is to use this new parse_auto inside cache::Prepare instead of parse, here precisely: https://github.com/tweag/nickel/blob/3b01e62864209734a587728bc8c2c39754f65602/core/src/cache.rs#L905

Overall this change doesn't require any new logic, but just re-organizing the cache::Cache methods. @olorin37, do you feel like tackling this? Of course I understand that you might not have the time or the additional motivation :slightly_smiling_face:

olorin37 commented 2 months ago

Surly I'll try to grasp it, but I am not sure if I'll manage, cause of time:) If not I'll inform you... but first I'll try to do it.

yannham commented 2 months ago

Great! If you think that would help at some point, we can even schedule a little pair programming session, as it seems we're in the same timezone. Let me know (e.g. on Discord)!

olorin37 commented 2 months ago

After the re-organization I got exactly the same problems:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Snapshot Summary ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Snapshot file: cli/tests/snapshot/snapshots/snapshot__pprint-ast_stdout_simple_record.ncl.snap
Snapshot: pprint-ast_stdout_simple_record.ncl
Source: cli/tests/snapshot/main.rs:59
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Expression: out
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
-old snapshot
+new results
────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
    0       │-{
    1       │-  aaaaa | Number | default = 1,
    2       │-  bbbbb
    3       │-    : String
    4       │-    | force
    5       │-    = "some long string that goes past the 80 character line limit for pretty printing",
    6       │-  ccccc : { x : Number, y : Number } = { x = 999.8979, y = 500, },
    7       │-  ddddd
    8       │-    | Array std.string.NonEmpty
    9       │-    = [ "a", "list", "of", "non", "empty", "strings" ],
   10       │-}
          0 │+import "[INPUTS_PATH]/pretty/simple_record.ncl"
────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

It seems, first layer of imports should be somehow resolved.

yannham commented 2 months ago

@olorin37 I think you're missing one step, but you're not far. Sorry if that wasn't clear, but my suggestion of going the "get rid of xxx_multi" road is a different direction that your initial take. The idea would thus be to restore the new_from_file specific case for single files, that you originally got rid of. Finally, we need to determine the format from the path automatically in cache::Cache::prepare. More concretely:

  1. Revert 59e6c28260f5f614141aae23284b277e79f440a8

  2. Add this helper method to cache::InputFormat, next to from_path:

    /// Returns an [InputFormat] based on the file extension of a source path.
    pub fn from_source_path(source_path: &SourcePath) -> Option<InputFormat> {
      if let SourcePath::Path(p) = source_path {
        Self::from_path(p)
      }
      else {
        None
      }
    }

    (I didn't check this code, so it might need a few tweaks to make the compiler happy, but to get the idea)

  3. Then, we can finally use that from within cache::Cache::prepare: replace the following line

    https://github.com/tweag/nickel/blob/3b01e62864209734a587728bc8c2c39754f65602/core/src/cache.rs#L905

    by something like:

    let input_format = self.file_paths.get(file_id).and_then(|source_path InputFormat::from_source_path(source_path)).unwrap_or_default();
    if let CacheOp::Done(_) = self.parse(file_id, input_format)? { 

    This should let the cache select the right format automatically now, even for a single program.

I think in the long term, we might want to also do your initial simplification - the two aren't mutually exclusive - but we then need to fix the pretty printer, which is making your tests fail. Some tests are checking that nickel pprint-ast - which parse a file and pretty-print the parsed ast - behave correctly, but now it only sees the root import instead of the actual program. In the meantime, this approach - which is also simplifying the code and make the input format parameter passing more explicit - is probably the fastest way to get there.

olorin37 commented 2 months ago

Right, I missed that you proposal is an alternative to mine. And, yes I actually do what you proposed without understanding how it would help those snapshot tests (with pretty print). So, I'll follow your idea and tips.

"Less is more" removing this one unnecessary case from my initial proposition should be done some day I think, unless it does not complicates things even more 🤔

olorin37 commented 2 months ago

@yannham Thanks for guidance, now it is working. Please, make an review, anything to improve (with this "fast-path" approach)?