stackmuncher / stm_app

This software engineer profile builder turns your code into a detailed list of skills for an online directory of software developers.
https://stackmuncher.com
GNU Affero General Public License v3.0

Unreliable git-log parsing #43

Open rimutaka opened 2 years ago

rimutaka commented 2 years ago

The way git-log is currently parsed is unreliable. If the log is corrupt, the logic will fail or produce an incorrect result.

E.g., a file named `commit log.md` from https://github.com/G-yhlee/coupang-reward-system was read as a new commit line, placing `log.md` into the SHA1.


```
commit 6a01533392e808faec2d0a3076af9c26f31919ff
Author: G-yhlee <yhleesoft@gmail.com>
Date:   Fri Sep 3 20:08:20 2021 +0900

    i

.gitignore
book/server/s2c.md
commit log.md
```
The handling was somewhat improved in https://github.com/stackmuncher/stm_app/commit/2cbaf72f29db673b28f879033e14f46f19fc2293, but it is still unreliable.
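
The failure mode can be illustrated with a minimal sketch. It is not the actual stm_app parser, which this issue does not show. It assumes a hypothetical helper, `naive_commit_sha`, that treats any line beginning with `commit ` as a commit header:

```rust
/// Hypothetical naive check, for illustration only: any line that starts
/// with `commit ` is treated as a commit header and the remainder of the
/// line is taken as the SHA1.
pub fn naive_commit_sha(line: &str) -> Option<&str> {
    line.strip_prefix("commit ")
}
```

With this check, the genuine header yields `6a01533392e808faec2d0a3076af9c26f31919ff`, but the file line `commit log.md` also matches and yields `log.md` as the "SHA1", because a prefix test alone cannot tell a header from a file path.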

## Suggested improvements
1. Check that the SHA1 fully complies with its format (40 hexadecimal characters).
2. Allow skipping log lines until a reliable commit section is encountered.
3. Switch to the `--raw` format to get the action performed on each file (A, M, D) and a more structured output.
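
The three suggestions could be combined roughly as follows. This is a sketch under stated assumptions, not the stm_app implementation. The names `RawEntry`, `parse_commit_header`, `parse_raw_line` and `parse_log` are hypothetical. The log is assumed to come from `git log --raw` with no ref decorations on the commit line, and paths are assumed to contain no tabs or characters that git would quote.

```rust
/// Status letter and path taken from one `git log --raw` file line.
#[derive(Debug, PartialEq)]
pub struct RawEntry {
    pub status: char, // e.g. 'A', 'M', 'D'
    pub path: String,
}

/// Suggestion 1: accept a header only if it is exactly `commit ` followed
/// by 40 lowercase hexadecimal characters (assumes no decorations).
pub fn parse_commit_header(line: &str) -> Option<&str> {
    let sha = line.strip_prefix("commit ")?;
    let is_hex = sha.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if sha.len() == 40 && is_hex {
        Some(sha)
    } else {
        None
    }
}

/// Suggestion 3: parse a raw line of the form
/// `:<old mode> <new mode> <old hash> <new hash> <status>\t<path>`.
pub fn parse_raw_line(line: &str) -> Option<RawEntry> {
    let rest = line.strip_prefix(':')?;
    // Metadata and path(s) are separated by a tab.
    let (meta, paths) = rest.split_once('\t')?;
    let fields: Vec<&str> = meta.split(' ').collect();
    if fields.len() != 5 {
        return None;
    }
    // The status may carry a score (e.g. `R086`); keep only the letter.
    let status = fields[4].chars().next()?;
    // Renames and copies list `old\tnew`; keep the new path.
    let path = paths.rsplit('\t').next()?;
    Some(RawEntry { status, path: path.to_string() })
}

/// Suggestion 2: collect files per commit, skipping every line that is not
/// inside a section opened by a valid header.
pub fn parse_log(log: &str) -> Vec<(String, Vec<RawEntry>)> {
    let mut commits: Vec<(String, Vec<RawEntry>)> = Vec::new();
    let mut reliable = false;
    for line in log.lines() {
        if let Some(sha) = parse_commit_header(line) {
            commits.push((sha.to_string(), Vec::new()));
            reliable = true;
        } else if line.starts_with("commit ") {
            // Malformed header: ignore lines until the next valid one.
            reliable = false;
        } else if reliable {
            if let Some(entry) = parse_raw_line(line) {
                if let Some((_, files)) = commits.last_mut() {
                    files.push(entry);
                }
            }
        }
    }
    commits
}
```

In raw output every file line starts with `:`, so a file named `commit log.md` can no longer be mistaken for a header, and the 40-character check rejects `commit log.md` even if it does appear at the start of a line. Paths with unusual characters would likely need `-z` output and a different splitter, which this sketch does not cover.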