statiqdev / Statiq.Web

Statiq Web is a flexible static site generator written in .NET.
https://statiq.dev/web

Refactor Git modules #130

Closed daveaglick closed 8 years ago

daveaglick commented 8 years ago

The Git modules could use a little refactoring before release. Here's what I have in mind:

@LokiMidgard - I'm prepared to go ahead and make the changes; they shouldn't take long (you've already done the heavy lifting). I just wanted to get your thoughts before I do.

LokiMidgard commented 8 years ago

Currently there are four different modules. Two of them use the input documents and add metadata to them; the other two just create documents from the repo information, ignoring the input. I'm not sure we should mix those two behaviors. Maybe we could make two modules instead of one: one to add metadata to existing documents, the other to create new documents.

For the second point, I think we should add an overload that lets you specify the folder and falls back to a default otherwise. If we used two modules, the default for the Git module that creates new documents could be InputFolder, and for the module that adds metadata to files it could be the folder of the current document. Maybe I'm missing a use case, but I don't think you can get any information from a Git repo about a file that is not in a subfolder of that repo.

daveaglick commented 8 years ago

Ah, I see. GitFileCommits and GitContributor both add metadata with a collection to the input documents, with the Source of each input document serving as the indication of which file commits/contributors to get. I'm not opposed to having two modules as you suggested, but what if all four operations cloned and output the input documents?

GitFileCommits and GitContributor would be the same as they are now and GitAuthors and GitCommits would add a metadata value that contains the collection of authors/commits to each input document (the same collection if we use the default source of InputFolder or a different collection if we use a DocumentConfig and change the Git repo we look at per-document). We'd no longer be able to "flatten" the CommitInformation and Author objects since they're going into a collection in an existing document, but I like this new idea better anyway. It would let us use the input documents as a control if we want different Git repos for example, and would cleanly associate aggregate information about those repos with the input document(s).

I'm fine keeping the default as InputFolder - the overloads can be used if someone wants an alternative. Will this work if the root of my repo is one level up (for example, I store my entire site in Git including my wyam.config and I want to get the history for a specific Markdown file in my "Input" folder)? I.e., is libgit2sharp smart enough to open a repo if you give it a folder that's nested inside the repo?

LokiMidgard commented 8 years ago

GitAuthors currently creates a document for each author, containing the name and a collection of all files they contributed to in the form of CommitInformation. The reason was to create a detailed page for every contributor that lists the files they contributed to. If we restructure the data so that it is more navigable, this would still be possible with just the collection of authors.

Maybe something like this: [image: class-02]

With all relationships between these classes being bidirectional.

For the last part, you are right. That's how I use it currently. It starts the search at InputFolder and moves up until it finds a Git repo.
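
The upward search described above can be sketched roughly as follows. This is an illustration in Python rather than the module's actual code; `find_repo_root` is a hypothetical name, and the check for a `.git` entry is an assumption about how a repo folder is recognized, not a description of what libgit2sharp does internally.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest folder at or above `start` that contains `.git`.

    Hypothetical helper: returns None when no enclosing repo is found.
    """
    current = start.resolve()
    # Check the starting folder first, then each parent up to the filesystem root.
    for folder in (current, *current.parents):
        # `.git` is usually a directory, but can be a file (worktrees, submodules).
        if (folder / ".git").exists():
            return folder
    return None
```

With a site stored in a repo one level above the input folder, calling `find_repo_root` on the input folder would return the parent that holds `.git`.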

daveaglick commented 8 years ago

So I finally got some time to look at this and ended up doing a pretty big refactoring. Hope you don't mind! I started out with some small changes, but new ideas kept coming to me about ways to make it more generally applicable and one thing led to another. There are now two modules, one for getting commits and another for getting contributors (which can be configured to get committers, authors, or both). For each module the user has the option to get all commits/contributors for the repository (which will generate new documents) or get ones specific to the input document (which will add metadata to the input documents). I also removed the special classes and relied on nested documents to contain the data - this approach has worked really well in the code analysis module because you can then pass metadata of one of the top-level documents (such as the list of commits for a specific file) to other modules since it's just a sequence of documents itself.
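
To make the two modes concrete, here is a rough sketch in Python, not the real module. Everything in it is an assumption made for illustration: documents are modeled as plain dictionaries of metadata, and the names `git_commits`, `for_each_input`, and the `Sha`, `Author`, `Files`, `Source`, and `Commits` keys are hypothetical.

```python
from typing import Dict, List

Document = Dict[str, object]  # stand-in for a document: metadata only


def commit_to_document(commit: Dict[str, object]) -> Document:
    """Wrap one commit record as a document (hypothetical metadata keys)."""
    return {
        "Sha": commit["sha"],
        "Author": commit["author"],
        "Files": list(commit["files"]),
    }


def git_commits(
    inputs: List[Document],
    history: List[Dict[str, object]],
    for_each_input: bool = False,
    key: str = "Commits",
) -> List[Document]:
    """Sketch of the two modes described above.

    Repository mode: ignore the inputs and output one new document per commit.
    Per-input mode: output a copy of each input with its commits attached
    as a nested sequence of documents under `key`.
    """
    commit_docs = [commit_to_document(c) for c in history]
    if not for_each_input:
        return commit_docs
    outputs = []
    for doc in inputs:
        # Keep only the commits that touched this document's source file.
        related = [c for c in commit_docs if doc["Source"] in c["Files"]]
        outputs.append({**doc, key: related})  # copy; the input is not mutated
    return outputs
```

Because the nested value is itself a list of documents, it could be handed to another step unchanged, which is the benefit of nested documents mentioned above.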

LokiMidgard commented 8 years ago

Can't wait to try it out :)