src-d / datasets

source{d} datasets ("big code") for source code analysis and machine learning on source code

Property descriptions in the web docs are outdated #93

Open dpordomingo opened 6 years ago

dpordomingo commented 6 years ago

The documentation at https://docs.sourced.tech/datasets/publicgitarchive/web/md/index says:

| Column name | Column description |
| --- | --- |
| EMPTY_LINES_COUNT | Number of empty lines on files on commit pointed by default HEAD. |
| CODE_LINES_COUNT | Number of lines with code on files on commit pointed by default HEAD. |
| COMMENT_LINES_COUNT | Number of commented lines on files on commit pointed by default HEAD |

But each of these columns actually contains a list of counts, one per language in the repository, in the same order as the preceding per-language counter columns.
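
To illustrate the difference, here is a minimal Python sketch of how such per-language lists could be read. It assumes the lists are comma-separated and aligned with a `LANGS` column; that column name, the encoding, and the sample row are illustrative assumptions, not taken from the actual index file.

```python
import csv
import io

# Hypothetical sample row: the LANGS column name, the comma-separated
# encoding, and all values are assumptions made for illustration only.
SAMPLE = (
    "LANGS,EMPTY_LINES_COUNT,CODE_LINES_COUNT,COMMENT_LINES_COUNT\n"
    '"Go,Python","10,4","200,50","30,5"\n'
)

COUNT_COLUMNS = ("EMPTY_LINES_COUNT", "CODE_LINES_COUNT", "COMMENT_LINES_COUNT")


def per_language_counts(row):
    """Pair each per-language list with the languages, position by position."""
    langs = row["LANGS"].split(",")
    result = {lang: {} for lang in langs}
    for column in COUNT_COLUMNS:
        values = [int(v) for v in row[column].split(",")]
        if len(values) != len(langs):
            raise ValueError(f"{column} has {len(values)} values for {len(langs)} languages")
        for lang, value in zip(langs, values):
            result[lang][column] = value
    return result


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts = per_language_counts(rows[0])
```

If the columns held a single total, as the current description suggests, this positional pairing would be unnecessary. That is why the documentation should state that each value is a list.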