Closed daveaglick closed 8 years ago
Currently there are four different modules: two of them use the input documents and add metadata to them, while the other two just create documents from the repo information, ignoring the input. I'm not sure we should mix those two behaviors. Maybe we could make two modules instead of one: one module to add metadata to existing documents, and another to create new documents.
For the second point, I think we should add an overload where you can specify the folder, falling back to a default if you don't. If we used two modules, the default for the Git modules that create new documents could be InputFolder, and for the module that adds metadata to files it could be the folder of the current document. Maybe I'm missing a use case, but I don't think you could get any information from a Git repo about a file that is not in a subfolder of that repo.
Ah, I see. GitFileCommits and GitContributor both add metadata with a collection to the input documents, with the Source of each input document serving as the indication of which file commits/contributors to get. I'm not opposed to having two modules as you suggested, but what if all four operations cloned and output the input documents?
GitFileCommits and GitContributor would be the same as they are now, and GitAuthors and GitCommits would add a metadata value that contains the collection of authors/commits to each input document (the same collection if we use the default source of InputFolder, or a different collection if we use a DocumentConfig and change the Git repo we look at per-document). We'd no longer be able to "flatten" the CommitInformation and Author objects since they're going into a collection in an existing document, but I like this new idea better anyway. It would let us use the input documents as a control if we want different Git repos, for example, and would cleanly associate aggregate information about those repos with the input document(s).
I'm fine keeping the default as InputFolder - the overloads can be used if someone wants an alternative. Will this work if the root of my repo is one level up (for example, I store my entire site in Git including my wyam.config and I want to get the history for a specific Markdown file in my "Input" folder)? I.e., is libgit2sharp smart enough to open a repo if you give it a folder that's nested inside the repo?
GitAuthors currently creates a document for each Author containing the name and a collection of all files he contributed to, in the form of CommitInformation. The reason was to create a detailed page for every contributor that lists the files he contributed to. If we restructure the data so that it is more navigable, this would still be possible with just the collection of Authors.
Maybe something like this:
With all relationships between these classes bidirectional.
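One way to read "bidirectional" here is that authors, commits, and files can each navigate to the others. The following is only a minimal sketch of that idea, not the structure actually proposed: the type and member names (Author, Commit, TrackedFile, Link) are hypothetical and not taken from the Git modules.

```csharp
using System.Collections.Generic;

// Hypothetical types for illustration only; not part of Wyam.
public class Author
{
    public string Name { get; set; }
    public List<Commit> Commits { get; } = new List<Commit>();
}

public class TrackedFile
{
    public string Path { get; set; }
    public List<Commit> Commits { get; } = new List<Commit>();
}

public class Commit
{
    public string Sha { get; set; }
    public Author Author { get; private set; }
    public List<TrackedFile> Files { get; } = new List<TrackedFile>();

    // Wires up both directions so each side can navigate to the other.
    public static Commit Link(string sha, Author author, params TrackedFile[] files)
    {
        var commit = new Commit { Sha = sha, Author = author };
        author.Commits.Add(commit);
        foreach (var file in files)
        {
            commit.Files.Add(file);
            file.Commits.Add(commit);
        }
        return commit;
    }
}
```

With a shape like this, a contributor page could walk from an author to his commits and on to the files they touched, while a file page could walk the other way from a file to its commits and their author.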
For the last part you are right. That's how I use it currently. It starts the search at InputFolder and moves up until it finds a Git repo.
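The upward search described here can be sketched in plain .NET. This is an illustration only, not the module's code: the RepoLocator class and FindRepositoryRoot method are hypothetical names, and the sketch assumes a repository is recognizable by a ".git" entry in its root. As far as I know, LibGit2Sharp offers the same kind of lookup through Repository.Discover(startingPath).

```csharp
using System.IO;

public static class RepoLocator
{
    // Hypothetical helper: walks from startPath toward the filesystem root
    // and returns the first directory that contains a ".git" entry,
    // or null if no such directory is found.
    public static string FindRepositoryRoot(string startPath)
    {
        DirectoryInfo dir = new DirectoryInfo(Path.GetFullPath(startPath));
        while (dir != null)
        {
            string marker = Path.Combine(dir.FullName, ".git");
            // ".git" is a directory in a normal clone and a file in worktrees/submodules.
            if (Directory.Exists(marker) || File.Exists(marker))
            {
                return dir.FullName;
            }
            dir = dir.Parent; // becomes null once the root has been checked
        }
        return null;
    }
}
```

Calling FindRepositoryRoot on an "Input" folder nested one level inside the repo would return the repo root, which matches the case asked about above where wyam.config and the input folder live in the same repository.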
So I finally got some time to look at this and ended up doing a pretty big refactoring. Hope you don't mind! I started out with some small changes, but new ideas kept coming to me about ways to make it more generally applicable and one thing led to another. There are now two modules, one for getting commits and another for getting contributors (which can be configured to get committers, authors, or both). For each module the user has the option to get all commits/contributors for the repository (which will generate new documents) or get ones specific to the input document (which will add metadata to the input documents). I also removed the special classes and relied on nested documents to contain the data - this approach has worked really well in the code analysis module because you can then pass metadata of one of the top-level documents (such as the list of commits for a specific file) to other modules since it's just a sequence of documents itself.
Can't wait to try it out :)
The Git modules could use a little refactoring before release. Here's what I have in mind:
[...] Git with fluent methods for each type of information: WithAuthors(), WithCommits(), etc. The output documents would be an aggregate of the different types of requested information (with a metadata key to identify which type of information the document contains). This will allow consolidation of logic and should be more efficient when getting multiple types of information (for example, the commit history for two different files).
[...] InputFolder is where we want to look. This will let us work the Git module into a bigger generation that isn't just about Git data. You can still use the InputFolder by passing it to the constructor using a ContextConfig delegate.
[...] CommitInformation and Author container objects we place the metadata directly in the output documents as standard .NET primitives. This will make it easier to consume from downstream modules and templates.
[...] DocumentConfig delegate to supply the path for each input (which would still allow the current behavior of looking at IDocument.Source).
@LokiMidgard - I'm prepared to go ahead and make the changes, they shouldn't take long (you've already done the heavy lifting). Just wanted to get your thoughts before I do.
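As a rough illustration of the fluent shape proposed above, the sketch below models a single module whose output items each carry a key identifying their type and hold only primitive values. It is not Wyam code: GitSketch, the dictionary-based "documents", and the "GitType" key are hypothetical stand-ins, and the author and commit data is passed in rather than read from a repository.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a single consolidated module; not the Wyam API.
public class GitSketch
{
    private bool _authors;
    private bool _commits;

    // Each fluent method records a request and returns this for chaining.
    public GitSketch WithAuthors() { _authors = true; return this; }
    public GitSketch WithCommits() { _commits = true; return this; }

    // Emits one metadata dictionary per item, tagged with its type,
    // using plain strings rather than container objects.
    public List<Dictionary<string, object>> Execute(
        IEnumerable<string> authorNames, IEnumerable<string> commitShas)
    {
        var output = new List<Dictionary<string, object>>();
        if (_authors)
        {
            foreach (var name in authorNames)
            {
                output.Add(new Dictionary<string, object>
                {
                    { "GitType", "Author" },
                    { "Name", name }
                });
            }
        }
        if (_commits)
        {
            foreach (var sha in commitShas)
            {
                output.Add(new Dictionary<string, object>
                {
                    { "GitType", "Commit" },
                    { "Sha", sha }
                });
            }
        }
        return output;
    }
}
```

A call such as new GitSketch().WithAuthors().WithCommits().Execute(names, shas) would return one mixed list, which downstream steps could filter on the "GitType" value.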