haskell / cabal

Official upstream development repository for Cabal and cabal-install
https://haskell.org/cabal

Strategies for managing Cabal freeze files with platform-specific dependencies #8059

Closed shayneczyzewski closed 2 years ago

shayneczyzewski commented 2 years ago

Hi there! We are working on a project that is making use of a file system watching library called fsnotify that has platform-specific dependencies. If we include fsnotify as a build-depends in our Cabal file and run cabal freeze on Linux, we will see hinotify get resolved as a dependency and used in the freeze file. If we then cabal freeze again on Mac, it will remove hinotify and instead it resolves the dependency to hfsevents in the freeze file. So we are kind of stuck on how to proceed with our Cabal freeze file since each platform will undo the work of the other when resolving dependencies.

TLDR: how can devs handle platform-specific resolved dependencies in a way that still allows them to use a freeze file effectively with Cabal? We would like our freeze file committed to our repo, but in a way where cabal freeze on different platforms plays nicely together. Are there any ways around that for this scenario? Thanks so much for any tips!

shayneczyzewski commented 2 years ago

I should also note that we cannot explicitly include both dependencies to avoid the overwriting either, as hinotify will not build on macOS due to a missing dependency on a foreign library.

Mikolaj commented 2 years ago

Hi Shayne! Define dummy packages or libraries somewhere so that you have the same set on both OSes?

shayneczyzewski commented 2 years ago

Hi @Mikolaj thanks for the fast reply and idea! :D I'm much newer to Haskell/Cabal than my teammates, so I will let them know about the idea as I am having a bit of a hard time imagining how to structure it at the moment, but let me see if I am on the right path.

So the idea is I would have a local dummy package/lib version of hinotify I could pick up on my Mac, but on Linux they would use the real version, so we would conditionally resolve based on platform. And similarly we have a dummy hfsevents for them to use on Linux, while I use the real one. And then we explicitly include them both as dependencies, but with os checks in our Cabal file determining if we use the dummy or real versions, and as long as the versions matched our freeze output should be identical? Sound like the direction of the idea? Thanks for confirming!

Mikolaj commented 2 years ago

I'm sure your teammates have thought about something like this. So the real input from me is that it's possible there is no easier way. However, to make sure, please ask on some Haskell channels, whether IRC, Matrix, discourse, Discord, etc. I'm sure some people have been in such a situation. I haven't. There may even be a ready Haskell package or C library somewhere that uses hinotify or hfsevents depending on a local environment. I haven't googled.

shayneczyzewski commented 2 years ago

Sounds good, will do, and thanks for your note @Mikolaj. I will report back if there is something not covered worth noting after more outreach. Appreciate it

fgaz commented 2 years ago

Another option is having platform-specific freeze files (--project-file affects the freeze file name too).

with conditionals and imports in project files (#7783), this could become pretty simple and clean

Martinsos commented 2 years ago

@Mikolaj teammate here :D, and while I am the most experienced with Cabal in our team (and I am also not very experienced :D), I do have to admit I don't have a good idea how to approach this, nor had I thought of dummy packages.

We can give the dummy package approach a try, it sounds interesting, but it is a hack at the end, right? I was hoping we might find a way on how to do it in a more "supported" way. What are your thoughts @Mikolaj, is this something that Cabal freeze should possibly support directly, if not now then at some point in the future?

Would it maybe make sense to have two separate freeze files -> one for Linux, and one for Mac? And then somehow having Cabal know which freeze file to read and also which one to generate based on the operating system it runs on?

I now see answer by @fgaz , that sounds like it goes in this direction -> thanks, we will explore that more!

Martinsos commented 2 years ago

@fgaz if we use --project-file, we could go with having cabal-linux.project and cabal-osx.project, and then also corresponding cabal-linux.project.freeze and cabal-osx.project.freeze files.

However, cabal-linux.project and cabal-osx.project will then be completely identical -> is there any way to avoid that duplication?

Also, how does cabal know which set of files to use? I see that --project-file can't be specified via a cabal.project file. So that means that it has to be specified on the CLI? If so, this means we need to attach this to every Cabal command we issue?

Martinsos commented 2 years ago

I read now about import, so I guess we can use that to reduce/remove duplication between cabal-linux.project and cabal-osx.project?
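For illustration, a minimal sketch of how import might remove that duplication, assuming a cabal-install release that includes #7783. The per-platform file names are the ones proposed above; the `packages: .` line is an assumption about the project layout. The block shows three separate files, each introduced by a comment.

```cabal
-- ---- file: cabal.project (shared settings; single-package layout assumed)
packages: .

-- ---- file: cabal-linux.project
-- Thin wrapper: its freeze file is cabal-linux.project.freeze
import: cabal.project

-- ---- file: cabal-osx.project
-- Thin wrapper: its freeze file is cabal-osx.project.freeze
import: cabal.project
```

Each command would then select a wrapper explicitly, for example `cabal freeze --project-file=cabal-linux.project` on Linux.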

However, that still leaves the question as to how should we specify --project-file.

It would be ideal if we could have just one cabal.project with a conditional that specifies the name of the freeze file based on os -> but we can't do that, can we? I haven't found an option for specifying the name of just the freeze file.

Martinsos commented 2 years ago

There is another problem that still remains with this approach where we have two sets of freeze files -> on Linux machine, I won't be able to generate new freeze file for Osx, if I want to change some dependencies! And vice versa. Hm, that doesn't sound great, I don't see how we can go around that?

fgaz commented 2 years ago

Yes I think you have to specify it on the CLI. Also relevant: #7367

Mikolaj commented 2 years ago

We can give the dummy package approach a try, it sounds interesting, but it is a hack at the end, right?

It's such an enshrined and venerable hack in software packaging that I wouldn't worry. However, if we can use and document what's already in cabal, or complete and round up some functionality in cabal to make it expressible explicitly, all the better. Having many ways to do the same thing is what Haskellers like.

gbaz commented 2 years ago

For your specific case, one might simply write a cross-platform wrapper for the notify functionality you need, and not freeze that :-)

In the future, this PR should allow both conditionals (including on os) and imports (and conditional guarded imports) in project files: https://github.com/haskell/cabal/pull/7783

That said, it will still be the case that if you want a freeze file for a specific platform, you will need to freeze on that platform.

jneira commented 2 years ago

An alternative to freeze files could be to freeze only the Hackage index, using index-state in the cabal.project. It is not as strong as a full freeze file, which also includes a frozen index-state, but it could be strong enough for your use case. You don't get new versions or revisions, and if you don't touch .cabal files there are not many other things which can affect the build (they exist of course: flags can dance, but most times, depending on your workflow, you want them to dance).

We are doing it in hls and it has worked surprisingly well for now:

https://github.com/haskell/haskell-language-server/blob/a538641bf76ead5bc24f19926d259b67e4aa9c01/cabal.project#L45
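A minimal sketch of this approach, assuming a single-package project. The timestamp is illustrative and is not the one used in hls.

```cabal
-- cabal.project
packages: .

-- Resolve against the Hackage index as it was at this moment
-- (illustrative timestamp), so later uploads and revisions are not seen.
index-state: 2022-03-01T00:00:00Z
```

Since no per-package versions are written down, the same file can be committed and used on every platform; each platform's solver still makes its own choices within that index snapshot.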

faassen commented 2 years ago

For your specific case, one might simply write a cross-platform wrapper for the notify functionality you need, and not freeze that :-)

fsnotify is already that cross-platform wrapper. Not freezing it would be a possible solution, but how can we specify that a specific package and its dependencies should not be frozen at all?

on Linux machine, I won't be able to generate new freeze file for Osx,

That's why I think OS-specific freeze files defeat the point of freeze files where I specify a known set of working versions for all the other developers to use.

Martinsos commented 2 years ago

@jneira thanks for the advice, as things are looking right now, it might make sense for us to ditch the freeze file and pin down only the index state! And then if we figure out that is not good enough, we can explore further. Regarding that -> I have to admit I don't yet know if we want to have pinned down or free flags, I probably don't know enough about the use cases to make that decision. I would be interested to understand at one point what is the comparison between stack/stackage vs pinning down index state vs using freeze file -> I believe both stack/stackage and freeze file pin down the flags also, while pinning down index state is similar but doesn't pin down the flags?

Martinsos commented 2 years ago

As for the cabal freeze challenges: looking at different issues -> different GHC versions (https://github.com/haskell/cabal/issues/7367), different OSes, it seems to me the proper solution might be in the direction of having per-environment (where environment is defined as combo of os, ghc and arch) freeze files. Meaning that a user could define the environments they care about, and then on cabal freeze, a freeze file for each of those environments would be generated (is that possible? Can we generate freeze file for osx while on linux -> I guess in theory we should be able to, right? I am not sure though if current Cabal code is close to supporting that? Or is there a theoretical restriction for this?). Additionally, cabal would use the correct freeze file based on the current environment it executes on. Would this make any sense as a solution?

shayneczyzewski commented 2 years ago

Really appreciate all the timely advice and back-and-forth on this question everyone, thanks again! 🙏🏻

jneira commented 2 years ago

We are doing it in hls and it has worked surprisingly well until now:

Want to note we considered using cabal freeze files but multi ghc and windows support in hie/hls made it unreliable: https://github.com/haskell/haskell-ide-engine/pull/1561

jneira commented 2 years ago

Can we generate freeze file for osx while on linux -> I guess in theory we should be able to, right? I am not sure though if current Cabal code is close to supporting that? Or is there a theoretical restriction for this?). Additionally, cabal would use correct freeze file based on the current environment it executes on. Would this make any sense as a solution?

and windows too :wink: ? well, you could run the solver impersonating another os to trigger the appropriate conditions in all the packages involved. But afaik the solver takes into account things like native libs and pkg-config. Not an expert on the solver component but it does not sound like something easy to do.

gbaz commented 2 years ago

"the point of freeze files where I specify a known set of working versions for all the other developers to use." -- note that this is not the point of freeze files as I understand it in cabal. This is precisely because the known set of working versions will vary by platform and compiler, etc.

In my view, a freeze file is intended for a single developer to pin down locally what they are doing, or to match between a developer machine and a build box, etc.

To pin down the general space of known working versions is the job of version bounds in the cabal file.

Regarding "fsnotify is already that cross-platform wrapper. Not freezing it would be a possible solution, but how can we specify that a specific package and its dependencies should not be frozen at all?" that's a good point. But I'm now confused. The freeze file just offers constraints, it does not force usage. So if you have a freeze file which includes hinotify, even though that is linux only, it should just get ignored on mac, which is fine? And vice versa for hfsevents?

In which case it sounds like on either platform, you produce a perfectly valid cross platform freeze file which just so happens to not pin down one or another platform specific lib for the other platform?

If that's all that the issue is, I personally would choose to not care. Or, I would create a freeze file per platform in the checked-in repo, and let users locally choose to alias to whichever file they desired.
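To make the "constraints, not forced usage" point concrete, here is a hedged sketch of a fragment of a freeze file generated on Linux. The version numbers are illustrative and not taken from the project discussed here.

```cabal
-- cabal.project.freeze (fragment, generated on Linux; versions illustrative)
-- On macOS the hinotify entry constrains a package the solver never picks,
-- so, per the reasoning above, it should simply be ignored there.
-- hfsevents is not listed, so on macOS it is left unpinned.
constraints: any.fsnotify ==0.3.0.1,
             any.hinotify ==0.4.1
```

Under this reading, the file is usable on both platforms as long as nobody re-runs cabal freeze on the other platform and commits the result.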

jneira commented 2 years ago

well, you could run the solver impersonating another os to trigger the appropriate conditions in all the packages involved. But afaik the solver takes into account things like native libs and pkg-config. Not an expert on the solver component but it does not sound like something easy to do.

Thinking about it twice, maybe it would be worth exploring that path: cross-solving. Windows also supports pkg-config and native libs via msys2. As long as each os-dependent package is only buildable for that os, dependencies could be added with no conditionals. For example you can add Win32 to a stack.yaml and that configuration is usable on all os's. Otoh exclusive flags, or dependencies buildable for all os's but not working (or not performant, etc) in all of them, would need conditionals.

Martinsos commented 2 years ago

In my view, a freeze file is intended for a single developer to pin down locally what they are doing, or to match between a developer machine and a build box, etc.

I wasn't aware of that! Since the only place where I previously encountered the idea of a lock file was npm and its package-lock.json, I thought the purpose of cabal freeze is the same as package-lock.json. At https://docs.npmjs.com/cli/v8/configuring-npm/package-lock-json, they state that one of the purposes of package-lock.json is "Describe a single representation of a dependency tree such that teammates, deployments, and continuous integration are guaranteed to install exactly the same dependencies.". So for package-lock.json, the purpose is certainly to ensure consistent builds across the whole team and multiple machines.

You said you think purpose of cabal freeze is for:

  1. Single dev to freeze the dependencies.
  2. To freeze dependencies between dev machine and build box.

For (1), it sounds like freeze file is then not to be used in team setting? From what I understood so far, there is only one freeze file intended per cabal project, not one per team member? As for (2) -> that sounds reasonable. But what if I am building for multiple environments? Let's say for both Linux and Osx and Windows? What if my teammate needs to debug the latest build that was released, play with that version of the code -> how can they be guaranteed they're using same set of packages as was used in the build, if freeze file is not shared among team members?

To pin down the general space of known working versions is the job of version bounds in the cabal file.

Sure, although I think we have to emphasize it is not "known" working versions, it is what we think are working versions. There are just so many combinations of versions that could get picked based on those constraints, that normally in practice you can't really ensure all of those are certainly known to be working together. That is why there is value in pinning down / freezing exact versions of packages. Cabal file says "these versions should all work", but freeze file says "this I actually tried/tested and it worked". Not to mention that in cabal file we don't specify versions of dependencies of dependencies (and so on)! So those can change underneath us, based on how the packages we use define version bounds for their dependencies, which we don't have control over.
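A small sketch of that contrast, with illustrative bounds and versions (neither is taken from the project in question): the package description states a range, while the freeze file records one resolved version per package, including transitive dependencies the package description never mentions.

```cabal
-- In the package's .cabal file: a range we believe should work
build-depends: fsnotify >=0.3 && <0.4

-- In cabal.project.freeze: the exact versions that were actually resolved,
-- including hinotify, which fsnotify pulls in on Linux
constraints: any.fsnotify ==0.3.0.1,
             any.hinotify ==0.4.1
```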

If that's all that the issue is, I personally would choose to not care. Or, I would create a freeze file per platform in the checked-in repo, and let users locally chose to alias to whichever file they desired.

Would you mind explaining this further? How would you let users alias these files, via which mechanism? How would you ensure that in CI, the correct freeze file is used?

Martinsos commented 2 years ago

Can we generate freeze file for osx while on linux -> I guess in theory we should be able to, right? I am not sure though if current Cabal code is close to supporting that? Or is there a theoretical restriction for this?). Additionally, cabal would use correct freeze file based on the current environment it executes on. Would this make any sense as a solution?

and windows too :wink: ? well, you could run the solver impersonating another os to trigger the appropriate conditions in all the packages involved. But afaik the solver takes into account things like native libs and pkg-config. Not an expert on the solver component but it does not sound like something easy to do.

Yes, windows too probably!

Aha, so I wasn't aware that the resolver takes into account things like native libs and pkg-config -> is that due to figuring out those automatic flags? I imagined that the resolver doesn't care about the native system, except for the type of it in order to trigger the right conditionals. While the type could easily be simulated ("hey, resolve now as if you were on osx"), it does sound much harder to do cross-resolution if more information is needed from the system.

jneira commented 2 years ago

Aha, so I wasn't aware that the resolver takes into account things like native libs and pkg-config -> is that due to figuring out those automatic flags? I imagined that the resolver doesn't care about the native system, except for the type of it in order to trigger the right conditionals. While the type could easily be simulated ("hey, resolve now as if you were on osx"), it does sound much harder to do cross-resolution if more information is needed from the system.

yeah, see my last comment about: https://github.com/haskell/cabal/issues/8059#issuecomment-1077293138

michaelpj commented 2 years ago

We've come across the more general case of this too: the build plans for the same package on different platforms can be arbitrarily different. This is because even if all you do is add a dependency on package foo on MacOS, say, foo can have version constraints that cascade and cause the solver to pick different versions of arbitrarily many packages.

Since the only place where I previously encountered the idea of a lock file was npm and its package-lock.json, I thought the purpose of cabal freeze is the same as package-lock.json. At https://docs.npmjs.com/cli/v8/configuring-npm/package-lock-json, they state that one of the purposes of package-lock.json is "Describe a single representation of a dependency tree such that teammates, deployments, and continuous integration are guaranteed to install exactly the same dependencies.". So for package-lock.json, the purpose is certainly to ensure consistent builds across the whole team and multiple machines.

I hate to break it to you, but package-lock.json has exactly the same issue! Hilariously, it even relates to fsevents as well.

What if my teammate needs to debug the latest build that was released, play with that version of the code -> how can they be guaranteed they're using same set of packages as was used in the build, if freeze file is not shared among team members?

As jneira says, this is what pinning index-state gets you: that guarantees that the solver will pick the same things as it picked for you.

... unless they're on a different platform, in which case things can be arbitrarily different.

I do agree that freeze files are confusing - I also thought they would achieve this and was wrong.

Martinsos commented 2 years ago

@michaelpj thanks for sharing this!

Interesting that this also happens for package-lock.json, I somehow naively assumed they don't have that issue! I see multiple solutions proposed there, also including platform-specific lock files, or just putting it all into one lock file but with additional semantics that describe which package version is for which platform. Also some kind of optional packages, then some more advanced ideas, and so on. They have claimed to solve it with https://blog.npmjs.org/post/167963735925/v560-2017-11-27 (Fully cross-platform package-lock.json. Installing a failing optional dependency on one platform no longer removes it from the dependency tree, meaning that package-lock.json should now be generated consistently across platforms!) but it seems the problem still exists. It is interesting that people claim that yarn doesn't have that issue though. I wonder how they solve it -> I didn't yet have time to investigate. They do have this piece of docs https://classic.yarnpkg.com/blog/2016/11/24/lockfiles-for-all/ that seems very nice and practical, I plan to give that a read (EDIT: read it, interesting read in general on the problems with lock files, but doesn't offer solutions for the problem we are discussing here).

I am ok with different platform providing different resolution -> I don't think that can be avoided, it is a feature at the end, that we can have per-platform conditions in cabal. So solution with index-state does sound pretty good!

Ok, I think this resolves this issue mostly, with the answer being: freeze files are not to be used cross-platform, instead use index-state for reproducibility. This issue also does open the question of: Can we make freeze files work cross-platform? It seems like we could, and there are solutions and attempts out there (npm, yarn), but I guess the question is also, is it worth doing this, if we have a solution with index-state?

All together, should I close this issue, or leave it open to encourage further exploring these additional questions?

jneira commented 2 years ago

I am ok with different platform providing different resolution -> I don't think that can be avoided, it is a feature at the end, that we can have per-platform conditions in cabal.

To make things funnier, we already have conditionals on platform in cabal.project for master. So for master and 3.8, you could use a diff freeze file per platform and ghc :slightly_smiling_face: https://github.com/haskell/cabal/pull/7783

It has to be done manually or using conventions, generating the freeze file in each target platform so maybe a thing to do in ci
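A sketch of what such a convention could look like, assuming cabal-install 3.8 or later and that imports may be guarded by conditionals as described for #7783. The `freeze-*.project` file names are hypothetical: each would hold the `constraints:` produced by running cabal freeze on that platform (for example in CI) and then be committed. The sketch also assumes no plain cabal.project.freeze is present alongside.

```cabal
-- cabal.project
packages: .

-- Pull in the constraints that were frozen on the matching platform.
-- File names are hypothetical; each file is a copy of the freeze output
-- generated on that OS.
if os(linux)
  import: freeze-linux.project

if os(osx)
  import: freeze-osx.project
```

A further `impl(ghc ...)` condition could distinguish compilers in the same way, at the cost of one frozen file per platform and GHC combination.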

shayneczyzewski commented 2 years ago

Thanks for all the input here, everyone! I'm going to close this issue since I think it captures enough information to help others. Take care 👍🏻