[RFC] [WIP]: Proposed file reorg

Stebalien commented 9 months ago

Use XCB definitions from /usr/share/xcb by default.
Move distributed files into a lisp subdirectory.
Make a test directory.
Rename el_client.el to xelb_gen.el.

That way we can add tests without distributing them, and we'll stoop distributing the code gen script as well. Thoughts?

If this is OK, I'd like to merge #4 without tests, re-do this PR, then add the tests.

minad commented 9 months ago

Looks good to me. Does this require changes in the ELPA registry? If yes we should synchronize with Stefan Monnier.

EDIT: Changes are needed. Check out the elpa-packages file in the elpa repository.

minad commented 9 months ago

Thinking about this more - we don't need to restructure the directory if all we want is to ignore tests. We can add them to elpaignore. Personally I maintain a flat directory hierarchy for my packages, so I am not sure regarding the advantages of a restructuring.

Stebalien commented 9 months ago

Easier to distinguish between tests and other files.
Better compatibility with source distribution (e.g., package-vc). Otherwise, I'll have to add the root project to my load-path which will add all the tests, code gen, etc.

minad commented 9 months ago

I see. For me a nested hierarchy is slightly more complicated, we have to adjust elpa-packages etc. What I want to say - there are also complications and not only advantages. For me, introducing a hierarchy doesn't seem urgent or that important, given that the root directory is not very cluttered. Furthermore I like to avoid superfluous changes - it is better to somewhat maintain current conventions as long as we don't hit limitations, but maybe we already do?

minad commented 9 months ago

I think it depends on how we are going to proceed with the tests. Will there be a single xelb-tests.el (my preference) or tens of *-test.el files? If we will only have xelb-gen.el and xelb-tests.el, introducing a hierarchy doesn't seem justified.

Stebalien commented 9 months ago

I don't think we'll need more than one, may be two here. My main concern is installing from source.

What if I put just the tests in a subdirectory? That would keep them out of the load-path.

minad commented 9 months ago

If we have only a single or two test files, it seems artificial to create a subdirectory. Having the test file on the load path won't hurt if you use something like package-vc. This makes it even easier to load it and run tests (which you want if you mix development and installation). For Compat we have the exact same flat layout and it hasn't led to problems so far.

If for some reason, you want to have a cleaner installation, then my suggestion is to not use package-vc. The problem is that package-vc conflates development and installation. This creates clutter both ways. Your development directory contains compiled files and your installation directory contains possibly unwanted development files.

If I recall correctly, there has been a discussion on emacs-devel where people wanted to install from a separate git repository. This is what I am doing. I have a simple command which let's me select from my directories in ~/projects/elisp/, make a package out of it and install it.

medranocalvo commented 9 months ago

For my part I'm OK with what you both decide.

For me the biggest issue is differentiating the xcb-*.el files that are generated from those that are written. That is, I would find it an improvement to have a lisp and say gen folders. But it's not a big deal and I don't think it's justified.

Use XCB definitions from /usr/share/xcb by default.

I think that's an improvement, as a first-timer won't have to first download xcb-proto to the right place. If possible, include a comment pointing to the xcb-proto repository as a hint for someone wanting to generate a newer version. E.g.:

# To generate from a newer protocol version:
# 1. git clone https://gitlab.freedesktop.org/xorg/proto/xcbproto ../xcb-proto
# 2. Set PROTO_PATH =: ../xcb-proto/src
PROTO_PATH := /usr/share/xcb

That way we can add tests without distributing them, and we'll stoop distributing the code gen script as well. Thoughts?

I guess that's OK; users don't need it.

If this is OK, I'd like to merge #4 without tests, re-do this PR, then add the tests.

I didn't see tests in #4.

Stebalien commented 9 months ago

Tests: https://github.com/emacs-exwm/xelb/pull/4#issuecomment-1891173181 (ish).

Having the test file on the load path won't hurt if you use something like package-vc.

Including random files can cause byte compiler warnings. Although, we can (and should) disable byte compilation for the test/build files.

On the other hand, I'm kind of confused why you're so keen on keeping a flat layout, most complex lisp projects have multiple levels of directories: magit (lisp & test), ement (test), pdf-tools (lisp & test), forge (lisp & test), plz (test), evil (scripts dir).

Yes, package-vc isn't perfect, but it works and it's very convenient for testing. And yes, a few test/build files in my load-path won't cause major issues, but having subdirectories isn't going to cause any major issues either.

minad commented 9 months ago

@Stebalien Let me first say that I am sorry that we are over-discussing this simple issue. At least it is a rather harmless decision. I think we can both live with both possible layouts - flat or hierarchical. Note that I find your patches great and I am looking forward to what you will be coming up with next. You obviously know much more about the innards of Xelb and Exwm than I do. Hopefully we can still learn something from the process and find a mode of operation which will work well in the future.

Regarding changes in general or this change in particular, I am keen on keeping things simple and avoiding superfluous changes. The burden of proof should be such that a change must be backed by sufficient justification and bring significant advantages, which does not seem to be the case in my opinion. Changing the existing layout has a small associated cost in itself - git history browsing is slightly complicated, existing installations may be affected, ELPA changes needed, etc. Of course these are all only small costs. On the other hand, the advantages of a nested layout are also minor.

A flat hierarchy is the simplest layout. It works like this without any modifications on the side of ELPA, and is the default assumed by ELPA and MELPA. Introducing a hierarchy may lead to the tendency to accumulate unnecessary files (you mentioned "random files"), which we subsequently have to ignore or treat specially. This is something I have observed in multiple packages, and more often in the ones with deeper layouts. Given that Xelb and Exwm have worked like this for some time, introducing a single additional xelb-tests.el does not justify changing the existing layout.

The examples you mentioned are of course great packages and prime examples, but there are surely exist many examples for the opposite, e.g., some of the simpler packages by Tarsius are also flat. Some of your examples are more complex than Xelb. Note that both Tarsius and Alphapapa use their own special build and/or linter systems with special requirements. Tarsius even uses his own package manager Borg.

Regarding package-vc - the paradigm of package-vc is opinionated, with its sharing of a single installation and development directory. I worry that it may be too limited for many uses cases and does not take lessons learned in other systems into account. Straight for example solves installation in a different way. There the repositories and the installation directory are separate. This way a clean load path is guaranteed, if this is the user's preference. Maybe Elpaca does that too? I have not investigated this matter further - given that I am happy with just use plain package.el. And again, package-vc handles flat packages like Compat just fine, with compat-tests.el on the load path.

I think the problems being discussed here can all be solved and should not require a change in layout:

Files which we do not want to distribute to the end-user should be ignored via .elpaignore.
I find it important that test files compile cleanly. We also ensure clean compilation for compat-tests.el. (Unfortunately it is not immediately obvious there since we have some special CI code which runs the tests both with and without byte compilation. This is extra defense measure of Compat, which turned out to be necessary.)
Ideally we do not add any random files to the repository. Every Elisp file added to Xelb and Exwm should be loadable and compile cleanly, which helps maintenance in the long term. Having subdirectories would weakens such a practice.
Having xelb-tests.el and xelb-gen.el on the load path should not be harmful. It may even be beneficial. In the case of the tests it makes it easier to load and run them. xelb-gen.el could also be turned into a regular feature, such that it could be loaded and its individual functions could be tested or executed as usual from within Emacs.

Ultimately this issue boils downs to opinion and preference. Both alternatives work. If we cannot come to a decision, maybe @medranocalvo could state his preference. Generally in case of disagreement, preserving existing structure may be the better default, but again this is just my opinion. This does not mean that we should treat things as frozen. As soon as we hit limitations or see significant benefits of a change, we should of course apply the change.

Stebalien commented 9 months ago

If this were a massive refactor, I'd agree. But that's why I proposed just moving the el_client.el script into a scripts/build directory (the only actual change) and putting tests (new code, not existing code) into a test directory. I'm trying to find a compromise that works.

I've given my motivation for this change: I have a real and existing issue with the current layout. Sure, it may be minor, but it's an actual issue. Unfortunately "package-vc is bad" doesn't make my point any less valid: package-vc is part of Emacs core, works 99% of the time, and there's no good reason not to support it.

I'm asking for clarification because (a) the current situation is a problem for me and (b) I can't argue with "I don't like it".

minad commented 9 months ago

I also gave good reasons I think why I prefer a flat layout and that a flat layout introduces fewer problems. In the end it is preference, but there are also technical arguments which we have to weigh.

Can you please clarify which pressing issue you have with the existing layout or why you want to move files out of the way? I think I do not understand this sufficiently well. I argue that there should not be an issue with a flat layout. All .el files should compile cleanly and having them on the load path is an advantage for development and maintenance.

If files do not compile cleanly, the solution is not to move them out of the way, but to make sure that they compile cleanly. This is a maintenance issue which I consider important. I also do not want to clutter the repository with random files or directories which in the end only contain a single file.

I do not argue that package-vc is bad, but the design is opinionated, limited and ignores lessons from other systems. Now we rediscover potential problems which had already been solved. Furthermore a flat layout works well with package-vc and does not introduce compilation problems (as demonstrated by Compat), and may be even preferable since you have access to everything, which is advantageous for development.

Stebalien commented 9 months ago

I also gave good reasons I think why I prefer a flat layout and that a flat layout introduces fewer problems. In the end it is preference, but there are also technical arguments which we have to weigh.

I guess I missed that? I'm really just trying to find a compromise here, but I can't without knowing your full position. All I've seem so far is that you prefer a flat layout, but that doesn't help me find a compromise. If I knew why you had this preference, I'd be able to make progress.

Can you please clarify which pressing issue you have with the existing layout or why you want to move files out of the way? I think I do not understand this sufficiently well. I argue that there should not be an issue with a flat layout. All .el files should compile cleanly and having them on the load path is an advantage for development and maintenance.

I'd hardly call it pressing, but it's annoying:

I get random libraries added to my load path that shouldn't be.
E.g., loading el_client in emacs breaks the current emacs session because it's not a library, it's a script (it messes with the load path, turns on debugging, etc.).

These are, of course, minor issues. But they're real issues. I get not wanting to move all the lisp files to a separate directory, but moving just the tests and elisp scripts to subdirectories is pretty trivial, solves my issue, and seems to be how most projects handle this.

minad commented 9 months ago

Thanks, I understand. We could make sure that el_client.el (better named xelb-gen.el) and xelb-tests.el compile cleanly and can be loaded without negative effects. Having them on the load path should be even less of a problem then. Even right now, random files on the load path will not get loaded automatically. However we can do better, improving the script and creating a proper Elisp feature out of it. Would this be acceptable to you?

Overall I see three proposals:

We keep things as is and ensure that every .el file in the package directory is a feature which properly compiles and loads without side effects.
We create a few directories to move only a few files out of the way (test, script).
We reorganize completely, creating a meaningful structure by introducing lisp/, gen/, test/ or script/ directories. We can do this coherently for both Xelb and Exwm.

Stebalien commented 9 months ago

However we can do better, improving the script and creating a proper Elisp feature out of it. Would this be acceptable to you?

That's likely to complicate executing it. We'd have to run it with emacs -Q -l my_script.el --eval="(entrypoint ...)". Note: elisp scripts in a "scripts" directory isn't unusual:

Neither of those projects have a lisp/ directory (although evil does keep tests as an actual library, which does have some benefits while developing).

How about we start with a minimal version of 2:

Keep tests in the root, in an xelb-tests.el (or xcb-tests.el) file. They'll get included with package-vc, but that's not a huge issue and if you're using package-vc, you may actually want the tests in your load-path.
Put the codegen script in a scripts directory to keep it out of the load path.

minad commented 9 months ago

That's likely to complicate executing it. We'd have to run it with emacs -Q -l my_script.el --eval="(entrypoint ...)".

Execution will not necessarily become more complicated. We can detect script execution by inspecting command-line-args. Or we accept the complication, since the batch generation will be executed mostly from the Makefile anyway. But there are other advantages, see below.

How about we start with a minimal version of 2: Keep tests in the root, in an xelb-tests.el (or xcb-tests.el) file. They'll get included with package-vc, but that's not a huge issue and if you're using package-vc, you may actually want the tests in your load-path. Put the codegen script in a scripts directory to keep it out of the load path.

I prefer not to do this.

Let me recapitulate. The problem is that we have the Elisp file el_script, which is not a proper feature or Elisp library. If it gets loaded, it may break Emacs. Having the file on the load path is not a problem in itself, but still a little bit dangerous, since the file could get accidentally loaded more easily. Now we have two ways to handle this.

Move the problematic file out of the way into a sub directory (your preference).
Turn the file into a proper Elisp library (my preference).

Both 1 and 2 solve the problem we have.

I prefer Elisp libraries without undesired side effects since this eases maintenance. I consider it a good practice in general. One can evaluate or load the file without the fear of breaking something. Having the ability to load a library makes it possible to use it directly from within Emacs which eases testing by hand.

One may also evaluate during editing. For example for completion, one necessarily has to evaluate the file to make the symbols available. Not relevant for el_script, but if there are macros in a file, one has to evaluate the file to get proper macro indentation in the following code (To be precise, as a workaround one can only evaluate the macros one by one).

Additionally I prefer to not introduce sub directories if we don't have to, in the hope that we don't add further clutter in the future, but this is a minor concern.

Note: elisp scripts in a "scripts" directory isn't unusual:

Yes, such examples are not unusual. But Elisp scripts, which are not proper libraries, are not necessarily a good practice. I don't know if the aforementioned scripts belong to this class. Proper Elisp libraries have advantages as I pointed out. el_client.el is almost there. It is all functions, even all of them live in the xelb- namespace. The only problem is the side effect, which we can easily remove.

Therefore - would approach 2 of turning el_client into a proper library be acceptable or not? Note that we have a similar situation in Exwm. There we have exwm-config.el which you probably also don't want to use, but fortunately it does not have side effects such that loading it will not break Emacs.

Stebalien commented 9 months ago

But Elisp scripts, which are not proper libraries, are not necessarily a good practice.

You may feel that way, but I don't see that sentiment in the rest of the community. We even got a new -x flag to make it easier to write scripts with shebang lines.

How about making it not look like a library?

Remove the .el extension.
Add a #!/usr/bin/env -S emacs -Q --script line.
Mark it as executable.

That'll cause Emacs to exclude it from the load-path.

We can detect script execution by inspecting command-line-args.

Not cleanly. command-line-args isn't exclusive to scripts. We could check, noninteractive to see if we're in batch mode, but there's no way to tell if we're being used as a library in batch mode or as a script. The only reliable option is to either be a script or a library but not both.

I guess one option is to split it into two:

A small script.
A library loaded by the script.

Unfortunately, the el_client script works by evaluating code as it creates it to:

Ensure all the types are defined correctly.
Figure out where certain types are defined.

You're not going to get good results unless its run in a clean environment. We could rewrite it to avoid these side-effects, but that would be adding a bunch of complexity for no good reason.

Note that we have a similar situation in Exwm. There we have exwm-config.el which you probably also don't want to use, but fortunately it does not have side effects such that loading it will not break Emacs.

emacs-config is a proper library. I don't think it should be shipped with EXWM, but I can live with it if it gets pulled in by package-vc.

minad commented 9 months ago

Not cleanly. command-line-args isn't exclusive to scripts. We could check, noninteractive to see if we're in batch mode, but there's no way to tell if we're being used as a library in batch mode or as a script. The only reliable option is to either be a script or a library but not both.

To be precise - I mean to check command-line-arguments to see if it contains -x/--script and xelb-gen.el. This way we can distinguish between script execution and library usage. This will be reliable and it won't be a complication over the existing argv check.

You may feel that way, but I don't see that sentiment in the rest of the community. We even got a new -x flag to make it easier to write scripts with shebang lines.

This is not a feeling. I have written preference a few time, but please don't disregard the technical arguments I gave.

I gave arguments why a library is advantageous over a script. Yes, one can also write quick scripts but this is not better. A library has advantages when maintaining it, which I listed above. Which advantages do you see if we make it a script? If xelb-gen.el is a proper library there is no problem to keep it on the load path.

There exist many maintenance tools in the Emacs world which are not scripts but proper libraries, e.g., elpa-admin.el, package-lint.el and so on. Still, I appreciate the introduction of -x. It is good to extend our toolbox. The introduction is also understandable, given that many people write shell scripts and one could also wish to write Elisp scripts instead. Yet, the better approach is to write libraries.

minad commented 9 months ago

You're not going to get good results unless its run in a clean environment. We could rewrite it to avoid these side-effects, but that would be adding a bunch of complexity for no good reason.

This is certainly an argument. I have to look at el_client in more detail to evaluate what this would entail. Hopefully the change would not increase complexity, that's my hypothesis. In order to see, I would have to apply the changes.

How about making it not look like a library? Remove the .el extension. Add a #!/usr/bin/env -S emacs -Q --script line. Mark it as executable. That'll cause Emacs to exclude it from the load-path.

As a compromise and to resolve the situation here, I think I can agree to rename el_client.el to xelb-gen and do as you propose, if you are okay if I may try to turn xelb-gen.el into a library later on. It could very well be that this increases complexity or turns out to be unfeasible, but it shouldn't hurt to try? To be clear, this is not urgent, and I probably won't do this soon. I rather want to work on actual features and fix some bugs.

Stebalien commented 9 months ago

This will be reliable and it won't be a complication over the existing argv check.

Until I Import the library from another script, then the library thinks it's running as a script and everything breaks.

You may feel that way, but I don't see that sentiment in the rest of the community. We even got a new -x flag to make it easier to write scripts with shebang lines.

I gave arguments why a library is advantageous over a script. Yes, one can also write quick scripts but this is not better. A library has advantages when maintaining it, which I listed above. Which advantages do you see if we make it a script? If xelb-gen.el is a proper library there is no problem to keep it on the load path.

There's merit in factoring out complex logic from a script to:

Make it easier to maintain.
Make it reusable.

But that doesn't mean scripts are "bad practice".

Furthermore, scripts serve a different purpose:

Libraries need to work reliably with no intervention.
Libraries are, as their name suggests, libraries of reusable functionality.
Scripts (especially build scripts) usually fall into the category of "if it works, it works". I.e., you run them once as a part of the build process then move on.
Scripts generally don't contain generally-reusable components.

Yes, you could spend a lot of time trying to turn the script into a library, but then:

You'd over-complicate it (increasing the maintenance burden).
Spend time on the script that you could have spent on code that matters more.

If xelb-gen.el is a proper library there is no problem to keep it on the load path.

The cost (initial time, increased complexity, cost of maintenance, etc.) of making it a proper side-effect free library vastly exceeds the benefit of doing so.

It evals code into the global scope.
It has to accept arguments from the commandline.
It doesn't currently have to care too much about naming, conventions, what's public, what's private. It just "works".
It does one thing and I can't think of a single case where I'd want to "import it" into some other script/program. If/when we actually have a use-case, then we'd turn it into a library. Doing so beforehand would be premature and a waste of time.

There exist many maintenance tools in the Emacs world which are not scripts but proper libraries, e.g., elpa-admin.el, package-lint.el and so on.

Those aren't scripts, they're reusable libraries.

As a compromise and to resolve the situation here, I think I can agree to rename el_client.el to xelb-gen and do as you propose, if you are okay if I may try to turn xelb-gen.el into a library later on. It could very well be that this increases complexity or turns out to be unfeasible, but it shouldn't hurt to try? To be clear, this is not urgent, and I probably won't do this soon. I rather want to work on actual features and fix some bugs.

I'm happy with this compromise.

minad commented 9 months ago

Until I Import the library from another script, then the library thinks it's running as a script and everything breaks.

No, I assure you that this would not happen if the check is done carefully enough. One would have to to check the command line for both --script and xelb-gen.el.

But that doesn't mean scripts are "bad practice".

I don't want to go deeper into the discussion here, but I am looking at this with my "Emacs glasses" on. For me Emacs functions and Emacs libraries can be used to replace the usual scripts which are common if you use a Unix-style workflow centered around a shell with custom utility scripts. A script becomes a library or a function which I can load into Emacs and run it via a key binding, eval or via M-x.

This may sound a little absurd, but we are talking about an Emacs desktop here, which one can build with EXWM and some other packages. The script el_client we are discussing here is almost a normal Elisp library.

Also some of the tools we commonly refer to as scripts gradually evolve with time. At first a tool starts as a simple sequential shell script, then it may get replaced by some more maintainable Python script and in the next step the functionality will be extracted in a library with classes and all the common abstractions. At that point the script is reduced to a simple launcher which instantiates some library classes and calls library methods.

Yes, you could spend a lot of time trying to turn the script into a library, but then: You'd over-complicate it (increasing the maintenance burden). Spend time on the script that you could have spent on code that matters more.

Of course. This could happen, but it remains to be seen if the complications would be significant. My hypothesis is that this would not happen, given that el_client.el is an "almost Elisp library". But I already said, I don't consider this high priority. Like you, I am convinced that our time should better be spent elsewhere.

Those aren't scripts, they're reusable libraries.

Well, the boundaries are fluid. I think in the case of elpa-admin.el I actually disagree. If you look at the corresponding GNUMakefile, elpa-admin.el is used in exactly the same way as we are using el_client.el - as a script invoked via a Makefile. Yet, elpa-admin.el is structured like a regular Elisp library, which brings the aforementioned advantages during development (completion, jumping to definition, macro expansion and so on).

I'm happy with this compromise.

Thanks, me too! It is definitely the easiest solution to just rename el_client.el to xelb-gen and be done with it for now. :)

Stebalien commented 9 months ago

No, I assure you that this would not happen if the check is done carefully enough. One would have to to check the command line for both --script and xelb-gen.el.

That's way too hacky. If we absolutely need a library and a script, we should just write a small script that imports the library.

With respect to libraries v. scipts, it comes down to not over complicating things for the sake of "reusability". If you try to make a script into a reusable library before you need it to be a reusable library, I guarantee you you'll:

Waste a ton of time, you'll probably never use it.
Make maintaining it difficult because you'll be maintaining an abstraction you don't use.
Realize you got all the interfaces wrong if/when you do need to eventually reuse the code.

I.e., premature abstraction.

minad commented 9 months ago

I think we just disagree on these things.

I tried to explain my thoughts. I gave examples (python scripts, elpa-admin.el). There is no exact right or wrong on this matter. But you are shrugging these arguments of as premature abstraction. Why is this necessary? These generic terms "premature abstraction" or "premature optimization" are not useful. They are a general guideline and that's it.

Writing something like xelb-gen.el and elpa-admin.el as libraries is not something I call "premature abstraction". The code is not intended to be reusable, but writing it in the form of a library has advantages, I've mentioned multiple times now (no danger when evaluating, completion, jump to definition, macro expansion, ...). We can ask Stefan Monnier why he wrote elpa-admin.el the way he did. I don't think the goal was to make the code reusable, but rather more maintainable and safer. Maybe there is also a misunderstanding. When I say library, I don't mean an elaborate construct. A file without side effects, with a provide statement and a few defuns already makes up an Elisp library.

In the end we agree that time should not be wasted.

Stebalien commented 9 months ago

I tried to explain my thoughts. I gave examples (python scripts, elpa-admin.el). There is no exact right or wrong on this matter. But you are shrugging these arguments of as premature abstraction. Why is this necessary?

Because this script is not the same as those scripts. This code was not designed to run as a library (it says so right in the code) and, by not trying to be a library, it can save a fair amount of complexity. The code:

Evaluates code as it writes it.
Figures out which symbols to reference/call based on whether or not they've been defined.
Detects errors by noticing that certain symbols should have been defined but haven't been.

None of that would work as a library. It needs to be executed in a clean Emacs environment, and works by evaling code in that environment.

Look, if it were easy to rewrite it as a library, I'd say:

Move most of it to an xelb-gen.el library.
Keep a small xelb-gen script that imports the library and executes it for easy integration with build systems.

You're absolutely right that Emacs's builtin lisp editing features work best for libraries. But it isn't a library, it currently gets added to the load path, and making it a library would be more work than it's worth.

minad commented 9 months ago

Thanks, this makes sense. The way the environment is mutated saves complexity for the script in its current form and it makes it difficult to get rid of the side effects. I agree with that. One could probably store the state in a separate obarray or use another data structure. I really cannot tell how much complexity this would add. Maybe I underestimate it and you overestimate it. ;) But again, as long as we don't see the need to turn the script into a library, figuring this out, is a waste of time.

emacs-exwm / xelb

[RFC] [WIP]: Proposed file reorg #11