serlo / serlo-export

Export Serlo.org and Mathe für Nicht-Freaks articles into various output formats
https://de.serlo.org/
Apache License 2.0

project structure proposal #145

Open vroland opened 5 years ago

vroland commented 5 years ago

This project has come a long way since it started as a little LaTeX export experiment. But its scope and structure have changed, naturally. So I'd like to discuss some ideas regarding the development process and code structure.

The total amount of code in this repository is shrinking, which is good. But all this code has not gone away; most of it just moved to the tool repositories for the parser, linter, exporter, sitemap parser and template renderer. I like to keep these separate, but there are some drawbacks with project management:

So here are some ideas on how to address these issues:

Since naming is hard, I'd like to propose mfnf-export and mfnf-export-tools so we can find better ones ;) Please let me know what you think. Is merging the repositories a good idea? Would you prefer a single repository? Or a completely different structure?

Lodifice commented 5 years ago

To clarify, we have to understand that we are talking about two different things here:

  1. the core build system (mk directory except mk/scripts, config directory, and assets directory) -- Make code which defines what targets are built and how they are built, plus templates for some targets

  2. scripts and programs called by the build system which actually perform the tasks defined there (the mk/scripts directory and various other repositories)

If we went for full separation, this would mean extracting all the bash and Python scripts into individual repositories. A full merge would mean moving multiple trees of Rust code into the main repository. I think we don't have to discuss full separation, but I also don't like the full merge approach. In my opinion, this only makes sense if we have a large Rust library (similar to the old Python lib) and build all of our executables from it. However, I like the modular design too much to merge all the Rust code into one tree.

In my opinion, a good line of separation is the following: everything that is a standalone compiled project gets its own repository, so the structure would stay more or less the same. For issue tracking, the main repository (this one - but I agree that a name change is necessary) should be used. Issues in the other repositories are then only used for minor things such as performance optimization or refactoring which only affect that particular repository. For synchronizing new features across projects, git submodules can be used. I know that I sometimes speak badly of submodules, because people use them when it's not appropriate, but in this situation I think they're worth it. A feature branch in the main repository may check out a specific commit or branch in the submodules, and our make init rule could perform a simple git submodule update instead of cloning specific revisions. We could also add a post-checkout hook which automatically updates submodules, so that people aren't bothered with it and don't forget it.
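
As a rough sketch of the post-checkout hook idea (not something that exists in the repository yet), the hook could be installed by a small shell helper like the one below. The function name install_post_checkout_hook and its argument are hypothetical, and the submodule URL and path in the comments are placeholders.

```shell
#!/bin/sh
# Sketch only: install_post_checkout_hook is a hypothetical helper name.

# Write a post-checkout hook into the repository at $1 so that switching
# branches also checks out the submodule commits recorded by that branch.
install_post_checkout_hook() {
    repo_dir="$1"
    hook="$repo_dir/.git/hooks/post-checkout"
    mkdir -p "$repo_dir/.git/hooks"
    cat > "$hook" <<'EOF'
#!/bin/sh
# git passes: previous HEAD, new HEAD, and a flag (1 = branch checkout).
if [ "$3" = "1" ]; then
    git submodule update --init --recursive
fi
EOF
    chmod +x "$hook"
}

# One-time setup, run manually in the main repository
# (URL and path are placeholders):
#   git submodule add <tool-repository-url> tools/<tool-name>
#   install_post_checkout_hook .
```

Since hooks are not versioned with the repository, every clone would need to run the helper once; calling it from the make init rule would be one way to do that.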

I agree to using pull requests more often and reviewing them before merging.

If you like my idea on submodules, I may implement that - though you would have to give me some time (we've just talked about using curl and jq instead of Python on Wednesday and you've already implemented it).

vroland commented 5 years ago

I'll do some reading about git submodules and come back to you about them. But the general idea sounds all right.

Of course, full merge or full separation does not make sense. But I'd like to add that having a single Rust repository would make it easier to coordinate changes across tools. Currently, every tool might use a slightly different version of the template specification, since the git revisions of their dependencies are pinned. In a single repository we could depend on the path to a crate rather than its git revision, avoiding inconsistencies in their dependencies. Of course they would still live in separate crates.
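
To illustrate the difference, here is roughly what the two kinds of dependency look like in a tool's Cargo.toml. The crate name template-spec, the URL, the revision and the relative path are all made up for the example and do not refer to the actual crates.

```toml
# Current situation (sketch): each tool pins its own git revision,
# so two tools can end up on different versions of the same crate.
[dependencies]
template-spec = { git = "https://example.com/template-spec.git", rev = "0123abc" }

# Single-repository alternative (sketch): depend on the crate by path,
# so every tool builds against the same checked-out source.
# [dependencies]
# template-spec = { path = "../template-spec" }
```

Only one of the two forms would be active in a given Cargo.toml, which is why the second one is commented out here.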

(Regarding your last note: I've only implemented article download ;) See it as a teaser :D )

kulla commented 5 years ago

Some ideas: Moving the repositories to https://github.com/serlo-org/ is a good idea. We can start a new project which tracks the issues from multiple repositories. See also https://github.com/serlo-org/project-mfnf2serlo/issues/27

vroland commented 5 years ago

@Lodifice I agree that git submodules seem to fit our use case quite well. With some make targets to simply update them they could also be quite usable. So do we agree on using separate repositories for tools, and integrating them via git submodules?