cristibalan / braid

Simple tool to help track vendor branches in a Git repository.
http://cristibalan.github.io/braid
MIT License
457 stars, 54 forks

Filtering/removing some files or subdirectories #120

Open dapaulid opened 1 year ago

dapaulid commented 1 year ago

Is there a way to leave out some files or directories? In my use case, the repository has several subdirectories, and I want to exclude some of them (e.g. tests, examples, unused plugins).

Adding multiple mirrors, one per subdirectory via --path, unfortunately does not work for me, as the subdirectories still contain some files that should not end up in my repository.

mattmccutchen commented 1 year ago

There's currently no way to exclude some files from a mirror. It would be a reasonable enhancement and I've thought a bit about how it might be implemented, but like many enhancements, it would be a considerable amount of work for a very small team with possible ongoing maintenance costs as well. So I'd like to see some convincing use cases first.

In your case, are the files you want to exclude actually causing a practical problem in the downstream repository, or are they just distracting? What is the directory structure of the mirror as a whole and the files you want to exclude? What patterns/rules would you write to exclude the files if such a feature were supported? (You can change the actual filenames if they are confidential, but the more realistic you can keep the example, the better.) Do you envision a need to update the filter rules on an existing mirror?

One important decision to make would be what language to use for the patterns. My impulse would be to use the gitignore language, since anyone using Braid (and hence Git) should already be familiar with it or willing to learn it. The annoying thing is that the only way I can find to check the patterns against the file paths in the mirror (to perform the filtering) requires us to create a temporary repository, load the patterns into a fake gitignore file, and run git check-ignore (and I haven't tested that, so I'm not sure it works). Alternatively, I imagine there's some glob language supported by a Ruby library that we could use. Then we could filter the file paths in-process, but there could be subtle differences in behavior compared to the gitignore language that users might have to learn to deal with.
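To make the in-process alternative concrete, here is a minimal sketch using Ruby's built-in File.fnmatch. It is only an illustration of the idea, not Braid code: the helper filter_paths, the constant names, the glob list, and the sample paths are all assumptions made up for this example.

```ruby
# Hypothetical sketch, not part of Braid: drop every path that matches
# at least one exclude glob, entirely in-process via File.fnmatch.

# Illustrative exclude rules (assumed for this example).
EXCLUDE_GLOBS = ['tests/**/*', 'examples/**/*', '**/CMakeLists.txt'].freeze

# FNM_PATHNAME: '*' does not match across '/', and '**/' matches any
#               number of directory levels (including none).
# FNM_DOTMATCH: wildcards may also match names that start with '.'.
FLAGS = File::FNM_PATHNAME | File::FNM_DOTMATCH

# Returns the paths that match none of the exclude globs.
def filter_paths(paths, exclude_globs)
  paths.reject do |path|
    exclude_globs.any? { |glob| File.fnmatch(glob, path, FLAGS) }
  end
end

# Made-up sample paths, relative to the root of the mirror.
SAMPLE_PATHS = [
  'src/ua_types.c',
  'include/open62541/types.h',
  'tests/check_types.c',
  'tests/server/check_server.c',
  'examples/client.c',
  'CMakeLists.txt',
  'plugins/CMakeLists.txt'
].freeze

p filter_paths(SAMPLE_PATHS, EXCLUDE_GLOBS)
# => ["src/ua_types.c", "include/open62541/types.h"]
```

The sketch also shows the kind of behavioral difference mentioned above. In the gitignore language, a pattern such as tests/ excludes the directory and everything below it, and a leading ! re-includes a path. File.fnmatch has neither notion, so the recursive form tests/**/* has to be spelled out and any include/exclude ordering would need to be implemented separately.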

dapaulid commented 1 year ago

Thank you for your insights!

For my use case, the files I want to exclude really cause a problem, as they won't compile: The downstream repository uses a different build system and targets a specific platform/architecture, while the mirror contains files/directories for several architectures.

Yes, I think it could be useful to update the filter rules on an existing mirror, depending on the type (whitelist/blacklist) and structural changes of the mirror.

I like your idea of using the .gitignore syntax. It might not always be that intuitive, but it will get the job done and is something people are familiar with, as you said.

In my case, it would look something like this:

Mirror: https://github.com/open62541/open62541/tree/1.3

# exclude all top-level files/folders by default
/*

# ignore the following files in any directory
**/.CMakeLists.txt

# include the following top-level files/folders
!/src
!/include
!/deps

# include some plugin subfolders
!/plugins
/plugins/*
!/plugins/crypto/
!/plugins/include/

# include the posix architecture subfolder
!/arch/
/arch/*
!/arch/posix/

# include some common architecture files
!/arch/network_tcp.c
!/arch/network_ws.c

mattmccutchen commented 1 year ago

Thanks for the details about your use case.

> For my use case, the files I want to exclude really cause a problem, as they won't compile: The downstream repository uses a different build system and targets a specific platform/architecture, while the mirror contains files/directories for several architectures.

Have you considered adding logic to the downstream build system to exclude the open62541 files you don't want from the build? I know that excluding files for one reason or another is a common thing to do in build systems in general. If there's a reasonable solution in your build system, that would weaken the argument for adding filtering to Braid at this time.

dapaulid commented 1 year ago

Yes, it is possible in our build system, but it would mean another file to maintain. The unnecessary files would also still end up "polluting" our repository, which we would like to avoid.

Actually, the simple script I previously used for vendoring just removed the subdirectory and then copied over the desired files from the repo. Of course, we ran into the issue that our patches were overwritten after an update. That's what brought me to git subtree first, and then to Braid.

I somehow hoped that excluding files would be a common use case for vendoring, but maybe it's just me :)