haskell-nix / hnix

A Haskell re-implementation of the Nix expression language
https://hackage.haskell.org/package/hnix
BSD 3-Clause "New" or "Revised" License

Please, migrate FilePaths (String) to Path (from path) #912

Open Anton-Latukha opened 3 years ago

Anton-Latukha commented 3 years ago

During the (String -> Text) migration, the question of what to do with FilePath inevitably arose.

What I thought of doing was either to have type PathText = Text, or to have some equivalent arrangement for it, but in Text.

Of course, the POSIX libs & base underneath use String in any case.

The author of the path package and Path data type is Snoyman, who with hvr & Neil Mitchell is a coauthor of the Abstract FilePath Proposal (AFPP).

Overall, path is currently one of the frequently used packages.

So, with a new base being worked on, it seems like a good strategy to start the work of migrating from FilePath to Path now, because we would receive exactly Path, or a very close equivalent to it, in the new base.

Since Path is currently defined as:

newtype Path b t = Path FilePath

the data type logic treats the value as a path, and casting it to Text is still possible.

Because in some places in the code the boundary between FilePath and String currently does not exist, some functions that should accept only a FilePath declare it as String.

Migrating to newtype Path b t = Path FilePath should make the logic more robust and establish a proper type and type-casting boundary, which would allow easy further migrations whichever way the path data types go. And casting it into Text is just as possible.

For now I just want to finish the (String -> Text) migration and will work around FilePath for the time being.
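To illustrate the boundary idea, here is a minimal sketch using only base. The single-parameter `Path` below and the helpers `toPath`, `fromPath` and `hasNixExtension` are hypothetical stand-ins made up for this example; they are not the `path` package's API and not HNix code.

```haskell
module Main where

import Data.List (isSuffixOf)

-- Hypothetical stand-in for the boundary type. The real `path` package
-- type carries two extra phantom parameters (Path b t).
newtype Path = Path FilePath
  deriving (Eq, Ord, Show)

-- Explicit casts mark every place where a value crosses the boundary.
toPath :: String -> Path
toPath = Path

fromPath :: Path -> String
fromPath (Path p) = p

-- A path-only function: handing it a plain String is now a type error.
hasNixExtension :: Path -> Bool
hasNixExtension (Path p) = ".nix" `isSuffixOf` p

main :: IO ()
main = do
  let p = toPath "pkgs/default.nix"
  print (hasNixExtension p)
  putStrLn (fromPath p)
```

With this in place, `hasNixExtension "pkgs/default.nix"` no longer compiles, so every String-to-path conversion has to go through `toPath` and becomes visible in the code.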

Anton-Latukha commented 3 years ago

Ok.

I've just done the pass over the source code.

Of what is left after the migration, most of the remaining internal String and toText _ uses are FilePath. There was no type boundary between FilePath and String, and after the (String -> Text) migration the point where a FilePath becomes a textual type became even vaguer.

It became obvious that FilePath should be cast to some type that makes the boundary real, so that the path-handling functions are specific to path handling, the text-handling functions to text handling, and safe type casting happens in between.

path also provides functionality to distinguish between path types, which HNix currently does internally.
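As a rough sketch of how path kinds can be distinguished at the type level, the following base-only example uses a phantom type parameter. The tags `Abs` and `Rel` mirror names used by the `path` package, but everything defined here (`parseAbs`, `parseRel`, `append`) is a simplified, hypothetical stand-in and not that package's implementation.

```haskell
module Main where

import Data.List (isPrefixOf, isSuffixOf)

-- Phantom tags: they have no values and exist only at the type level.
data Abs
data Rel

newtype Path b = Path FilePath
  deriving (Eq, Show)

-- Smart constructors decide the tag once, at the boundary.
parseAbs :: FilePath -> Maybe (Path Abs)
parseAbs p
  | "/" `isPrefixOf` p = Just (Path p)
  | otherwise = Nothing

parseRel :: FilePath -> Maybe (Path Rel)
parseRel p
  | null p || "/" `isPrefixOf` p = Nothing
  | otherwise = Just (Path p)

-- Only "absolute base + relative path" type-checks.
append :: Path Abs -> Path Rel -> Path Abs
append (Path a) (Path r)
  | "/" `isSuffixOf` a = Path (a ++ r)
  | otherwise = Path (a ++ "/" ++ r)

main :: IO ()
main = do
  print (parseAbs "nix/store")
  print (append <$> parseAbs "/nix/store" <*> parseRel "pkg/default.nix")
```

Appending two absolute paths is rejected by the compiler instead of being checked by hand at runtime, which is the kind of internal bookkeeping such a type could take over.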

Overall, this is the next move in the transition away from String in HNix.

sternenseemann commented 3 years ago

Note that this still has the somewhat problematic assumption that all Paths are valid Unicode in some capacity. This is not correct for POSIX as paths are arbitrary byte sequences that don't require any kind of encoding. This would be an issue with nix in some (probably) corner cases since Nix strings are also just arbitrary byte sequences without any requirements in terms of encoding.

Using filepath-bytestring could be a way forward to alleviate this (and move away from String).
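A small sketch of the encoding problem, assuming the bytestring and text packages that ship with GHC. The file name bytes and the helper `decodesAsUtf8` are made up for illustration.

```haskell
module Main where

import qualified Data.ByteString as BS
import Data.Text.Encoding (decodeUtf8')

-- A legal POSIX file name: "fo" followed by the byte 0xFF,
-- which never occurs in valid UTF-8.
rawName :: BS.ByteString
rawName = BS.pack [0x66, 0x6f, 0xff]

-- Hypothetical helper: True when the bytes can be represented as Text.
decodesAsUtf8 :: BS.ByteString -> Bool
decodesAsUtf8 = either (const False) (const True) . decodeUtf8'

main :: IO ()
main = do
  print (decodesAsUtf8 rawName)
  print (decodesAsUtf8 (BS.pack [0x66, 0x6f, 0x6f]))
```

A name like `rawName` can exist on disk but has no faithful Text representation, whereas a ByteString-based path type can carry it unchanged.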

Anton-Latukha commented 3 years ago

Yes, it is true.

That is my trait: I like to do and state things gradually. I cringe a bit when people post epic, far-reaching goals in reports, because it is easy to generate a lot of far-reaching ideas and goals and never implement them, and then those entries loom in the reports as things that could have been but never came true.

Lifting all paths out of String across the project is already a task in itself, and needs type plumbing all the way from the input to the output. It also allows revisiting and readjusting the textual use in the project a bit. Beyond the nearest step, I do not like to state plans more than one step ahead, because the next step is seen better once the previous steps are completed. Doing things in smaller gradual steps also produces flexible code. As it is an n-pass process, the code just gets gradual treatment. It is easier to fix & prettify everything when one walks through the code like through a garden while meditating, doing some understandable task, compared to when people try to do a lot in one jump. None of this is specific to bytestrings; it is my general approach, and I apply it even to questions like this one.

Though you are right: paths can be in any encoding, and yes, filepath-bytestring seems the way. But so far Strings have also worked, & path is equivalent to them. Also, if it turns out that paths as bytestrings are a beneficial thing, the path type steps are better done in one release. Someone who lifts the type, and does some testing and research while doing it, would have an idea & would probably mention it in the PRs/reports along the way.

Anton-Latukha commented 2 years ago

Currently, the paths are lifted from String into newtype Path, & the Path type-system boundary is mostly properly established across the codebase.