microsoft / Linux-CommA

The Linux Commit Analyzer tracks patches from upstream and downstream kernels.
MIT License

Improve get_tracked_paths() #66

Open avylove opened 1 year ago

avylove commented 1 year ago

get_tracked_paths() looks at the maintainers file for every tag in the repo, then filters the tags down to v4 and above. But you're not guaranteed to have all the tags locally, so the only ref that will always get checked is the default one, origin/master.
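To make the problem concrete, here is a minimal sketch of the kind of filtering described above. It is not the project's actual code: `filter_tags`, `refs_to_check`, and the tag pattern are hypothetical names chosen for illustration, and the tag format (`v4.19`, `v5.10-rc3`) is an assumption.

```python
import re

# Assumed tag format: "v<major>.<minor>" with an optional suffix such as "-rc3".
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)")


def filter_tags(tags, min_major=4):
    """Keep only tags whose major version is at least min_major (hypothetical helper)."""
    kept = []
    for tag in tags:
        match = TAG_PATTERN.match(tag)
        if match and int(match.group(1)) >= min_major:
            kept.append(tag)
    return kept


def refs_to_check(local_tags, default_ref="origin/master"):
    """The default ref is always present; tags only count if they exist locally."""
    return [default_ref] + filter_tags(local_tags)


# With tags available locally, older releases are covered.
print(refs_to_check(["v3.16", "v4.19", "v5.10-rc3"]))
# With no tags fetched, only the default ref is inspected.
print(refs_to_check([]))
```

In the second call the result contains only origin/master, which is the situation described here: any path that exists only in an older tag is silently missed.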

There is a chance that a path we care about is in a previous tag but not in master, and that wouldn't be caught. There's really no way to know unless you make sure all the tags are local. Previously this was accomplished by pulling the complete history, but that's expensive if you do it over and over.

The v4 cutoff is also problematic because it is a hard-coded limit.

A potential fix is to provide an option such as --fetch-all to pull the complete history in cases where the cost is accepted or the full repo is already available. Another potential fix is to use another source for this data, such as GitHub. Since the goal is to get the history of a single text file, this would be less bandwidth-intensive than pulling the whole repo. This would make more sense if this logic were moved to a Linux-specific plugin rather than being in the main program logic.
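The --fetch-all idea could be wired up roughly as follows. This is only a sketch under stated assumptions: the program name `comma`, the helper names, the remote and branch defaults, and the shallow fetch used when the flag is absent are all hypothetical and not taken from the project. The functions only build the git command; they do not run it.

```python
import argparse


def build_parser():
    """Hypothetical CLI parser exposing the proposed --fetch-all switch."""
    parser = argparse.ArgumentParser(prog="comma")
    parser.add_argument(
        "--fetch-all",
        action="store_true",
        help="fetch all tags so every release can be inspected",
    )
    return parser


def fetch_command(fetch_all, remote="origin", branch="master"):
    """Return the git command to run; nothing is executed here."""
    if fetch_all:
        # Fetch every tag from the remote. A shallow clone may additionally
        # need `git fetch --unshallow` to obtain the complete history.
        return ["git", "fetch", "--tags", remote]
    # Assumed cheap default: a shallow fetch of the default branch only.
    return ["git", "fetch", "--depth", "1", remote, branch]


args = build_parser().parse_args(["--fetch-all"])
print(fetch_command(args.fetch_all))
```

Keeping the expensive fetch opt-in means repeated runs stay cheap, while users who already have the full repo or accept the cost can get complete tag coverage.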