yegor256 / cam

Classes and Metrics (CaM): a dataset of Java classes from public open-source GitHub repositories
http://cam.yegor256.com
MIT License

filter out too small and too big repositories #228

Open yegor256 opened 7 months ago

yegor256 commented 7 months ago

Let's filter out the smallest and the largest repositories (by number of files), maybe using a percentile-based cutoff: https://en.wikipedia.org/wiki/Percentile
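
A minimal sketch of what such a percentile cutoff could look like, in Python. This is only an illustration, not the project's implementation: the function name `filter_by_percentile`, the `file_counts` mapping of repository name to file count, and the 5th/95th percentile bounds are all assumptions chosen for the example.

```python
import statistics


def filter_by_percentile(file_counts, lower=5, upper=95):
    """Keep repositories whose file count lies between two percentiles.

    file_counts: hypothetical dict of repository name -> number of files.
    lower, upper: assumed percentile bounds (integers from 1 to 99).
    """
    counts = sorted(file_counts.values())
    # quantiles(n=100) returns 99 cut points; element p-1 is the p-th percentile.
    cuts = statistics.quantiles(counts, n=100, method="inclusive")
    low, high = cuts[lower - 1], cuts[upper - 1]
    # Repositories outside [low, high] are treated as too small or too big.
    return {repo: n for repo, n in file_counts.items() if low <= n <= high}
```

The bounds would need tuning against the actual distribution of repository sizes; file counts are typically heavily skewed, so asymmetric bounds may fit better than a symmetric pair.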

h1alexbel commented 4 months ago

@yegor256 assign me on this one, please

yegor256 commented 4 months ago

@h1alexbel go ahead, please