Problem being solved
Too much duplicated code is a problem: it is harder to maintain, people start copying and pasting complex code without understanding it, and it becomes harder to evolve the codebase toward better practices over time. A code duplication analysis can give insight into which authors promote this practice and which parts of the codebase could be extracted into reusable libraries.
Proposal
Check for duplicated lines across the entire repo and show the results from two perspectives: the authors with the most duplicated lines, and the files with the most duplicated lines. For each duplication, also show the author who first created the contents, to distinguish that author from authors who are simply copying them.
How to present this information is still to be defined.
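To make the proposal more concrete, here is a minimal sketch of how the analysis could work. It is only an illustration under stated assumptions, not a settled design: it assumes per-line blame data (file, author, commit time, text) has already been extracted, for example from `git blame --line-porcelain`, and it treats two lines as duplicates when their whitespace-stripped text matches exactly. The names `BlameLine`, `analyze_duplicates`, and `min_length` are hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BlameLine:
    """One line of a file plus its blame data (hypothetical input shape)."""
    path: str
    author: str
    timestamp: int  # commit time, seconds since epoch
    text: str


def analyze_duplicates(lines, min_length=10):
    """Group identical lines and summarize duplication by author and by file.

    Assumption: two lines are duplicates when their whitespace-stripped text
    matches exactly and is at least `min_length` characters long.
    """
    groups = defaultdict(list)
    for line in lines:
        key = line.text.strip()
        if len(key) >= min_length:  # skip trivial lines such as "}" or "else:"
            groups[key].append(line)

    by_author, by_file, origins = Counter(), Counter(), {}
    for key, occurrences in groups.items():
        if len(occurrences) < 2:
            continue  # the line appears only once, so it is not duplicated
        # The earliest occurrence is treated as the original, the rest as copies.
        original = min(occurrences, key=lambda occ: occ.timestamp)
        origins[key] = original.author
        for line in occurrences:
            by_file[line.path] += 1
            if line is not original:
                by_author[line.author] += 1  # only copies count against an author
    return by_author, by_file, origins


sample = [
    BlameLine("a.py", "alice", 100, "total = sum(x.price for x in items)"),
    BlameLine("b.py", "bob", 200, "total = sum(x.price for x in items)"),
    BlameLine("c.py", "bob", 300, "    total = sum(x.price for x in items)"),
    BlameLine("c.py", "carol", 300, "return total"),
]
by_author, by_file, origins = analyze_duplicates(sample)
print(by_author.most_common())  # [('bob', 2)]
print(origins)  # maps each duplicated line to its first author, here alice
```

In this sketch the original author is not counted as a duplicator, which matches the goal of separating whoever wrote the contents first from those who copied them. A real implementation would probably compare blocks of consecutive lines rather than single lines, to reduce noise from short, common statements.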