reposense / RepoSense

Contribution analysis tool for Git repositories
https://reposense.org
MIT License
245 stars 154 forks source link

Deprecate the CommandRunner class #442

Open eugenepeh opened 5 years ago

eugenepeh commented 5 years ago

Our tool has the pre-requisite of existing git installation and command line dependency which makes it more tedious to set up compare to a general tool.

Additionally, this reduce the maintainability due to security issues such as command injection, and may also prevent us from developing it into a integrate-able tool such as gradle plugin.

Let's move towards alternative such as jgit and deprecate the use of terminal / command prompt.

dcshzj commented 3 years ago

This is quite an epic issue and in the interest of reporting my progress so far, here are some notes on my attempt to perform this migration.

Potential replacements

The following is a table of the various CommandRunner usages and their potential replacements in JGit:

Usage JGit API JGit Class Remarks
GitBlame.java Git.blame() BlameCommand, BlameResult Need to build git blame result format
GitBranch.java Git.getRepository().getBranch() Repository
GitCheckout.java Git.checkout() CheckoutCommand
GitClone.java Git.cloneRepository() CloneCommand
GitDiff.java Git.diff() DiffCommand, DiffEntry Need to build git diff result format
GitLog.java Git.log() LogCommand Need to build git log result format, might not have certain information
GitLsTree.java None None Need to build using RevWalk and TreeWalk (more info)
GitRevList.java None None Need to build using LogCommand instead (more info)
GitRevParse.java None None Potentially no replacements available (more info)
GitShortlog.java Git.log() LogCommand Need to build git shortlog result format

Resources

  1. A jgit-cookbook that provides examples and code snippets for using JGit
  2. The API documentation for JGit. At the time of writing, the latest version available to be used in Gradle is 5.9.0.202009080501
  3. A tutorial on using JGit with some examples
  4. A guide to JGit

Evaluation

Pros

Cons

Notes

JGit has quite some potential in replacing our use of CommandRunner and also provides better error handling in general. However, as it requires us to be interacting with the API directly, we also need to replace the regex "hacks" that were developed over time with the proper API function calls, which is not trivial especially without the understanding of why certain information is needed in the first place. The "new" exceptions also need to be handled, which will involve knowing how the various parts of the software interact with one another.

Our implementation of CommandRunner also features running commands asynchronously, which at present is only used for cloning repositories. However, it does not seem that JGit is capable of providing such a functionality, which can be a blocker in replacing the existing Git cloning functionality.

JGit itself is not a complete implementation of Git yet, so there may be some features that are not available, if we ever need them in the future. This will also increase our project's dependency on this plugin. It is also worth checking out libgit2, which is an alternative implementation of Git core and has an experimental binding in Java called jagged.

If this is a step in the right direction, I will try to replace the various usages of CommandRunner incrementally. This task would track that progress. I have also provided an example of what JGit can potentially do in PR #1454, starting with the GitBlame.java file.

fzdy1914 commented 3 years ago

@damithc Your input for this change? It is quite a big change though. We should not worry about the feature of JGit can provide as it is still being maintained quite often.

I also agree that we may need to do some regex hack to make it work for JGit.

Maybe we can pack a Git binary and use it if the user does not have local git installation?

damithc commented 3 years ago

Given that we already has a working implementation, we'll need a pretty strong reason to migrate i.e., either something very negative about the current choice or a big value added by the alternative. I'm not sure if there is enough incentive for as of now. 🤔 But I'm open to be convinced otherwise. @dcshzj is the value add worth the cost of changing? If not, we can put this on hold for now. The work done so far is still valuable and can be used as a starting point if we decide to switch in a future time.

dcshzj commented 3 years ago

Maybe we can pack a Git binary and use it if the user does not have local git installation?

I think if we manage to migrate to JGit, we would be able to avoid using Git itself, so that we can skip the step of having the user to install Git on their local machine.

I'm not sure if there is enough incentive for as of now. 🤔 But I'm open to be convinced otherwise. @dcshzj is the value add worth the cost of changing? If not, we can put this on hold for now. The work done so far is still valuable and can be used as a starting point if we decide to switch in a future time.

Currently, the main selling point is better error handling, as the environment is better controlled compared to using the existing CommandRunner approach which could have many different possible Git versions and configurations being used.

The main cost is having to redevelop the Git backend (on a similar scale as the projectify frontend issue), although I believe it can be done incrementally without significant disruptions to other aspects of the codebase.

I personally would prefer if we can start the switch over to JGit, as I feel that we would eventually reap the benefits of better stability in the long run. However, it also depends on whether it is aligned to the long-term goals/roadmap of this project, and whether there are other more pressing issues to work on.

dcshzj commented 3 years ago

Leaving this task up for grabs. This task is ideal for an FYP project.

damithc commented 3 years ago

Thanks for the initial investigation @dcshzj Yes, let's put it on hold for the time being given that it requires rewriting a lot of code and we are not 100% sure if all the current features can be supported with JGit. We can revisit this later.

fzdy1914 commented 3 years ago

@dcshzj Thanks for your effort on this issue.

yhtMinceraft1010X commented 1 year ago

This is quite an epic issue and in the interest of reporting my progress so far, here are some notes on my attempt to perform this migration.

(Rest omitted for brevity)

The latest JGit version as of today is 6.4.0. It may be worth checking if some Git classes which did not have replacements available on JGit back in 2021 now have these replacements.

That being said, JGit 6.0.0 and above and above requires Java 11 and above.