Open masak opened 5 years ago
While not true for the conscious-risks ecosystem, in general dependencies can have dependencies. Only the direct dependencies will end up under `depends-on` in `package.yaml`, but all of the transitive dependencies will end up in `.dep-cache`.
So, for a given project we're really looking at a dependency tree. No, wait, a dependency DAG; two projects somewhere in the DAG can very well depend on exactly the same URI-and-SHA1. (The `.dep-cache` directory should probably be SHA1s on the first level down, and then just copies of the inside of `lib/` directories.)
From this, I think it's even fine for different parts of the dependency DAG to pull in the same project at different SHA1s. That should just transparently work.
What's not OK is cycles. Under reasonable assumptions, SHA1s make sure that things are "well-founded" and don't refer to each other cyclically forever. Maybe that's good enough. What I mean by "reasonable assumptions" is that someone might put in the work and compute a special pair of SHA1s of projects that could refer to each other. In that case, I almost feel they deserve whatever error message we put in for that scenario.
There's still the question of projects referring to each other cyclically when SHA1s are discounted. This, I think, we could detect, though unfortunately not at `007-dep add` time because at that point we don't know the URI of the current project. My feeling is that this will be very rare in practice, though, so I'm fine with not fretting about it so much. Again, in any case, the SHA1s are guaranteed not to be cyclical.
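As a rough illustration of the cycle check discussed above, here is a minimal Python sketch that ignores SHA1s and looks only at project URIs. The function name `find_cycle` and the shape of the input (a dict mapping each project URI to the URIs it depends on) are assumptions made for this example, not part of any existing 007 tooling.

```python
from typing import Dict, List, Optional

# Hypothetical helper: `deps` maps a project URI to the URIs it depends on,
# with SHA1s deliberately left out of the picture.
def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of URIs (first == last), or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, fully explored
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(uri: str) -> Optional[List[str]]:
        color[uri] = GREY
        path.append(uri)
        for dep in deps.get(uri, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # `dep` is already on the current path, so we closed a loop.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[uri] = BLACK
        return None

    for uri in deps:
        if color.get(uri, WHITE) == WHITE:
            found = visit(uri)
            if found:
                return found
    return None
```

A diamond-shaped DAG (two projects depending on the same third one) passes this check; only a genuine loop of URIs is reported.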
I guess there would be technical difficulties related to the following proposal, and I haven't thought about the details. So take it for what it is, a random thought. And please excuse me if this is not the place for that kind of activity.
One solution could perhaps be to make use of a separate repository, a library of sorts. If you don't plan to scale up your ambitions with 007 (an idea which I, as you might have gathered from mail conversations, would love, but I'm not, on the other hand, the one that would have to do actual work :)), would it be possible in practice? That would include the more narrow, partly (or wholly) wrongful view on what a module is. But in a sense, does it really matter what you call things. I am thinking in terms of a .h file. And also a .h file that would be meaningful to share with others. This would mean that the header and the box module would be one kind of module (because other people perhaps would want to use them in their own projects), while the game also would be a module but of another kind; to be more precise, the kind of module that other people would use, but not re-use in their own projects. This would, I think, limit the number of projects included in this library-repository; perhaps this repository could be more anarchistic in that case, and use an even more free license (a no-rules license, beside the rule no rules-ish/MIT).
As a developer you naturally would want to make your own different modules locally, just as you intend. And you'd also want to make use of other people's modules without being forced to clone their projects and extract the function you need. By use of an anarchistic 'public' library of this sort, the need for a package manager would disappear and you and other people in the 007-team could concentrate on other tasks.
(I know that you can make 'private' (not the keyword) .h files in C/C++, but you get my point... If it is an idea that is good or not, that I don't know.)
Just a thought. :)
> But in a sense, does it really matter what you call things.
Careless phrasing on your part, perhaps, but... yes, it matters? 😄
\<romeo> A rose by any other name would smell just as sweet.
\<bart> Not if you called it "stinkflower"!
Or, to be more precise, welcome to 007. Here, it really matters what you call things.
> I am thinking in terms of a .h file.
Modula-2 (one of the first languages to implement modules) also makes a split between "interface" and "implementation", the way `.h` files do. To me, the interface is declared implicitly, or at least very much inline, by the `export` statements in a module file. I consider that a feature (and one I don't take credit for; JavaScript does the same): things can't grow inconsistent if they're only one declaration instead of two files with partially repeated declarations.
> And you'd also want to make use of other people's modules without being forced to clone their projects and extract the function you need.
I think we're on the same page here. That's what I'm trying to do with my musings in this issue — allowing dependencies between projects/modules without (or "rather than") copy-paste.
> By use of an anarchistic 'public' library of this sort, the need for a package manager would disappear and you and other people in the 007-team could concentrate on other tasks.
I'm reminded of the "monorepo" structure some projects have chosen. (Though that'd be a single repository, not two.)
I dunno. I think your proposal might solve some problems and cause others. Actually, there's nothing stopping anyone from creating a large repository of everyone's modules like that. But I think it loses a thing I didn't point out above: the project/repository as the boundary of updates/releases. I might expand on that at some point.
Also, while I don't have any illusions 007 will ever grow a sizable community, I have very mixed feelings about shutting such a community into a single repo and telling it to play there.
Please don't consider my answer final. :smile: I'm still mulling over these things.
About the word part. Yes, I totally agree with you. It WAS a very careless, clumsy formulation. Words matter. But in this case, I still think there's a point to this. Only partly a point, and only because I lack the accurate terminology; I don't know what to call it. And I actually think that because of what you said the other day (that's why I wrote like that), 'module' is not the word I'm looking for, since a game would also be a module. :)
> I have very mixed feelings about shutting such a community into a single repo and telling it to play there.
I understand that. But I didn't mean this to be the only way, more of A way to handle the situation. No one would stop anyone from creating another lib-repository. I think it would be quite handy to collect all modules in one place, as long as one could choose what to actually include in her/his project.
But you know what, you've already convinced me this is a bad idea. Now I know what I think; I was a bit ambivalent, and what way is better than 'testing' your thought on someone else? And you don't seem to mind spar-n-correct.
> I am thinking in terms of a .h file.
By coincidence I ran into this criticism of ~~C#~~ C++ compile speeds. Yes, part of the reason is that C++ encourages putting too much in its .h files: not just interface details, but implementation, too. Thus things need to recompile too often.
Funnily enough, that section ends with the sentence "One suggested solution is to use a module system".
This is also touched upon in the outstanding C++ FQA.
That « C# » probably doesn’t want to be here
Indeed. Fixed; thank you.
Coming back to this one, and thinking about ergonomics:
Let's say I want to add the two dependencies `ascii.header` and `boxify` to my `conscious-risks` game. I'd issue these two commands:
$ 007-dep add https://github.com/claes-magnus/007-ascii-header-printer/ ascii.header
$ 007-dep add https://github.com/claes-magnus/007-boxify boxify
I'm very tempted to go with @claes-magnus's idea of having a library repository, except (a) not as a thing separate from 007/Alma itself, and (b) only listing names of third-party dependencies, linking them to URLs.
That is, in the case of the above invocation, I'd be able to get away with
$ 007-dep add ascii.header
$ 007-dep add boxify
which is of course a lot nicer.
An extra level of nicety would be for users to be able to easily have additional (third-party) lists, somehow. But that doesn't have to be in a minimum viable implementation.
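To make the name-list idea a bit more concrete, here is a small Python sketch of how a `007-dep add` argument might be turned into a repository URL. Everything here is an assumption for illustration: the `KNOWN_PACKAGES` table, the `resolve_source` function, and the rule that extra third-party lists are consulted before the built-in one. Only the two URLs come from the commands above.

```python
from typing import Dict

# Hypothetical built-in name list shipped with 007/Alma itself.
# The two URLs are the ones used in the long-form commands above.
KNOWN_PACKAGES: Dict[str, str] = {
    "ascii.header": "https://github.com/claes-magnus/007-ascii-header-printer/",
    "boxify": "https://github.com/claes-magnus/007-boxify",
}

def resolve_source(arg: str, *extra_lists: Dict[str, str]) -> str:
    """Turn a `007-dep add` argument into a repository URL.

    Full URLs pass through unchanged; short names are looked up first in any
    user-supplied lists (an assumed precedence), then in the built-in list.
    """
    if "://" in arg:
        return arg
    for names in (*extra_lists, KNOWN_PACKAGES):
        if arg in names:
            return names[arg]
    raise KeyError(f"unknown package name: {arg}")
```

With this, `resolve_source("boxify")` yields the same URL as the long-form command, and a full URL still works for projects that are not on any list.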
I want to add this discursive post about building package managers to this issue. I've skimmed it; need to go back and read it more carefully (and then maybe write a thoughtful summary here). I found it in one of rsc's articles about Go package management.
There's also this blog post praising the tip-of-the-iceberg utter simplicity of `go run main.go`. I want to take something away from that which can be easily summarized. Maybe it's simply that, if you do your build system right, including package management and reproducible builds (as Go does), then the equivalent of `go run main.go` is the sweet, sweet payoff for you and all of your users.
I was re-reading this ticket and "Reason for Modules" recently, as I added exports/imports/modules to a toy Lisp I have on the side. Since I did the simplest thing that could possibly work, I didn't even try to consider how I'd build a package manager. But it's something that was in the back of my head. I think for the most part, you can start (and stay) with an existing option. PureScript used to use bower, and now uses npm, both of which were targeted for the JS ecosystem.
> I think for the most part, you can start (and stay) with an existing option. PureScript used to use bower, and now uses npm, both of which were targeted for the JS ecosystem.
That is a good point. Using something existing is good not just because of the decreased workload, but it also creates an affinity with something already existing.
After recently seeing Herb Sutter praise backwards compatibility to the skies, it has been on my mind that taking "uncompromising interop" (with something, C or Java or JavaScript or Raku) is a really good idea, or at least something to seriously consider. It's the kind of design thing that has to be done from day 1, and can't be bolted on later. But Alma's design was never that beholden to anyone or anything, and it's never too late to have a better day 1 if we want.
This post with an overview of Python environment management and packaging tools makes me think. I guess the question for now is "how many of those five circles ought one design in from the start?".
For now, I have no simple answer. Need to think.
In this discursive post defending Rust productivity, I found a compelling argument for Rust's smaller "module" level and bigger "crate" level:
> Rust is one of the few languages which has first-class concept of libraries. Rust code is organized on two levels:
> - as a tree of inter-dependent modules inside a crate
> - and as a directed acyclic graph of crates
>
> Cyclic dependencies are allowed between the modules, but not between the crates. Crates are units of reuse and privacy: only crate's public API matters, and it is crystal clear what crate's public API is. Moreover, crates are anonymous, so you don’t get name conflicts and dependency hell when mixing several versions of the same crate in a single crate graph.
>
> This makes it very easy to make two pieces of code not depend on each other (non-dependencies are the essence of modularity): just put them in separate crates. During code review, only changes to Cargo.tomls need to be monitored carefully.
There's a risk that when we've implemented modules (that is, being able to import a file from another through the `import` statement), there's a whole part missing in order to be able to snap together the three modules in the conscious-risks ecosystem ([1] [2] [3]), and we all go "OK, now what". This issue is meant to look one corner ahead and address that.
There are three common meanings flying around for modules, all of which become relevant to 007 users at one point or other:
- An object with a namespace in it. "That thing you import from the file." When we do `import m from some.module`, the variable `m` ends up being a subtype of `Module`.
- A source file. "That file you import things from." Modules and files are basically identified, and so this is the "standard" meaning in some sense. The file contains `export` statements, and those are the names we can import somewhere else.
- A package/project/repository that can be installed. Think CPAN/pypi/npm. CPAN calls this level "distribution". npm talks about a "package". I think pypi calls this a "release".
That last level is a bit hidden to people. It's more like "I just want to install this module". It's one of those things (kinda like with block scopes vs stack frames) where we might be doing best in upholding the illusion of simplicity for the user and going "yes indeed, you want a module — fine", even though they are something that contains modules, not modules themselves.
On the other hand, I really really really don't want to build another package installer. It's supposed to be extremely hard to get right, and there's very little in there that would benefit 007 the language-experiment-for-doing-structured-language-extension.
I want to walk a delicate balance between the simplest thing that could possibly work and aargh no please not another package manager.
Here's my proposal, taken for the particular example of the conscious-risks ecosystem.
Let's say I want to add the two dependencies `ascii.header` and `boxify` to my `conscious-risks` game. I'd issue these two commands:
$ 007-dep add https://github.com/claes-magnus/007-ascii-header-printer/ ascii.header
$ 007-dep add https://github.com/claes-magnus/007-boxify boxify
Both these commands leave zero output if all goes well, but they create or change a `project.yaml` file to look something like this:
(That is correct YAML. I checked. I'm wary of adopting this format, but it's also quite clean, and we could mandate using only the subset we see above: nested dicts with strings. sYAML.)
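The "nested dicts with strings" restriction is easy to state as a check on already-parsed data. The Python sketch below is only an illustration: the function `is_syaml` is a made-up name, and the example document is a guess at the general shape of such a file. `depends-on` is the key mentioned in this thread, while the `uri`, `sha1` and `path` keys, the placeholder SHA1 and the file path are all hypothetical.

```python
def is_syaml(value) -> bool:
    """True if `value` is a string, or a dict whose keys are strings and
    whose values are themselves sYAML (strings or nested dicts)."""
    if isinstance(value, str):
        return True
    if isinstance(value, dict):
        return all(isinstance(k, str) and is_syaml(v) for k, v in value.items())
    return False  # lists, numbers, booleans, None are outside the subset

# Hypothetical parsed package file. Only `depends-on` is named in the thread;
# the inner keys and the values below are placeholders for illustration.
example = {
    "depends-on": {
        "ascii.header": {
            "uri": "https://github.com/claes-magnus/007-ascii-header-printer/",
            "sha1": "0" * 40,                 # placeholder, not a real revision
            "path": "lib/ascii/header.007",   # hypothetical file path
        },
    },
}
```

A checker like this would reject lists and bare numbers, which keeps the format small enough that a hand-written parser remains an option.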
Long story short, when we later run `risks.007` and it contains, say, an import statement pulling in `ascii.header`, the 007 runtime will know not to look among the project's own modules, because there's a declaration in `package.yaml` saying which repository, which Git revision, and which file path to go to in order to find that module file.
Note first that we're thereby taking on a kind of dependency on Git. Not Github particularly, but on Git. (A project on GitLab, or even something hosted on someone's server, should work just fine.) I'm OK with that. The way to lock onto a SHA1 is my way of completely sidestepping versions and semver and whatnot. It's not super-elegant, but it's the kind of extreme simplicity I'm looking for.
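The lookup described above (repository, Git revision, file path) could be sketched in Python roughly as follows. The names `Location` and `locate_import`, and the `uri`/`sha1`/`path` keys inside each `depends-on` entry, are assumptions for the sake of the example; the text only says that the declaration records those three pieces of information.

```python
from typing import NamedTuple, Optional

class Location(NamedTuple):
    uri: str    # which repository
    sha1: str   # which Git revision
    path: str   # which file inside that revision

def locate_import(manifest: dict, module: str) -> Optional[Location]:
    """Look up an imported module name in the parsed package file.

    Returns None when the module is not declared as a dependency, meaning
    the runtime should look among the project's own modules instead.
    """
    entry = manifest.get("depends-on", {}).get(module)
    if entry is None:
        return None
    # Hypothetical key names; the real file format may differ.
    return Location(entry["uri"], entry["sha1"], entry["path"])
```

The point of the sketch is the branch: a declared name short-circuits the search of local modules, and an undeclared one falls through to it.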
Note second that going to fetch those dependencies should happen as we run the `007-dep add` commands. We can't have it happen every time we run `risks.007`... so we cache the result in a hidden directory called `.dep-cache`. Like with `node_modules`, you're meant to `git ignore` this directory. As you're also supposed to commit and push your `package.yaml` file, when your colleagues (or whatever) download your repo, they have to `007-dep install` all the third-party stuff. A suitable error message when `.dep-cache` doesn't exist or doesn't contain things declared in `package.yaml` will push people in the right direction if they accidentally try to compile something with a missing third-party dependency.
I think that's it. The two depended-on modules in the conscious-risks ecosystem would in time need to (a) move things into the `lib/` directory and rename files correctly, and (b) get a `project.yaml` file of their own. But that's a fair price to pay, in my opinion. I can make the appropriate PRs for that after we have things well-tested on the 007 end of things.
The `provides` field would be used by the `ascii.header` and `boxify` packages to expose their respective modules. The last argument of `007-dep add` could be omitted if there's only one module in that field. I think there should be a `007-dep init` command to help create the `package.yaml` file, because, as usual, life is too short to hand-write YAML.
I'm almost a little pleased that this scheme avoids an npm-like central package authority. Instead we rely on Github for that, or rather, on URIs. There's no way to steal the name `boxify` for ever and ever.
There's supposed to be a `007-dep remove` command, not described here. Presumably there could also be a `007-dep update` command, for those brave enough to update to a dependency's latest commit.
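The missing-dependency check mentioned earlier (when `.dep-cache` doesn't exist or lacks something that is declared) might look something like this Python sketch. It assumes a cache layout of one directory per SHA1 directly under `.dep-cache`, as floated in the comments on this issue, and a hypothetical `sha1` key in each `depends-on` entry; the function names and the wording of the message are made up.

```python
from pathlib import Path
from typing import List

def missing_dependencies(manifest: dict, project_root: Path) -> List[str]:
    """Names of declared dependencies with no directory in the cache.

    Assumes `.dep-cache/<sha1>/` per dependency, which is one possible
    layout rather than a settled design.
    """
    cache = project_root / ".dep-cache"
    missing = []
    for name, entry in manifest.get("depends-on", {}).items():
        if not (cache / entry["sha1"]).is_dir():
            missing.append(name)
    return missing

def explain_missing(manifest: dict, project_root: Path) -> str:
    """Return an error message pointing at `007-dep install`, or '' if all is well."""
    missing = missing_dependencies(manifest, project_root)
    if not missing:
        return ""
    return ("Missing third-party dependencies: " + ", ".join(missing)
            + ". Run `007-dep install` to fetch them.")
```

Because a missing `.dep-cache` directory and a missing entry inside it both fail the `is_dir()` test, the fresh-clone case and the stale-cache case end up with the same hint.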