platisd / duplicate-code-detection-tool

A simple Python3 tool to detect similarities between files within a repository
MIT License
162 stars, 30 forks

Github Action fails to post when too many characters #11

Closed: brianpaden289 closed this issue 2 years ago

brianpaden289 commented 2 years ago

I am running the action on a legacy repo with a lot of duplication. So much, in fact, that the output is larger than GitHub allows in a single comment.

Posting results to GitHub failed with code: 422 {"message":"Validation Failed","errors":[{"resource":"IssueComment","code":"unprocessable","field":"data","message":"Body is too long (maximum is 65536 characters)"}],"documentation_url":"https://docs.github.com/rest/reference/issues#create-an-issue-comment"}

Can the output be clamped in size to avoid this error?

My temporary workaround is to increase the ignore_below value.

platisd commented 2 years ago

Wow, I never thought of that. :sweat_smile: But yeah, the output could surely be clamped. Feel free to submit a pull request; otherwise I can take a look over the weekend.
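For illustration, clamping could look roughly like the sketch below. It is not the tool's actual code: the function name `clamp_comment` and the truncation notice are hypothetical, the 65536 limit is taken from the error message above, and it assumes Python's `len()` counts characters the same way GitHub does.

```python
# Limit quoted in the 422 error message; assumed to be counted like len().
GITHUB_COMMENT_LIMIT = 65536


def clamp_comment(body: str,
                  limit: int = GITHUB_COMMENT_LIMIT,
                  notice: str = "\n\n(output truncated)") -> str:
    """Return body unchanged if it fits, otherwise cut it and append a notice."""
    if len(body) <= limit:
        return body
    if len(notice) >= limit:
        # No room for the notice itself, so just hard-cut the body.
        return body[:limit]
    # Reserve space for the notice so the final string is exactly `limit` long.
    return body[: limit - len(notice)] + notice


if __name__ == "__main__":
    report = "x" * 70000  # stand-in for an oversized report
    print(len(clamp_comment(report)))
```

Cutting at a fixed offset can split a table row or a code fence in the middle, so a real fix might prefer to truncate at a line boundary instead.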

brianpaden289 commented 2 years ago

I also noticed that many files in the printed output have no results, just an empty {}. Dropping those entries from the output dict would reduce the character count for free as well.
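Something along these lines might do it. This is only a sketch: the helper name `drop_empty_results` is made up, and it assumes the output dict maps each file path to a (possibly empty) dict of matches, which may not be the exact shape the tool uses.

```python
def drop_empty_results(similarities: dict) -> dict:
    """Keep only the files that have at least one reported match."""
    return {path: matches for path, matches in similarities.items() if matches}


if __name__ == "__main__":
    # Hypothetical sample data, not real output from the tool.
    sample = {
        "src/a.py": {"src/b.py": 87.5},
        "src/c.py": {},
    }
    print(drop_empty_results(sample))
```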

platisd commented 2 years ago

@brianpaden289 can you try out #13 or tell me how to reproduce the issue you had? I believe it is a good-enough workaround for your issue.