sailuh / kaiaulu

An R package for mining software repositories
http://itm0.shidler.hawaii.edu/kaiaulu
Mozilla Public License 2.0

Enable Kaiaulu to reuse our Motif TSE Dataset #218

Closed: carlosparadis closed this issue 1 year ago

carlosparadis commented 1 year ago

Now that we can reuse the motif analysis from TSE (#210), it would be helpful to also reuse its dataset: https://cdn.lfdr.de/stmc/#/projects.

The Git log and mailing list data can both be reused as-is, since they come as standard .git and .mbox files. The only limitation is the JIRA dataset, which is in XML (and, I believe, was produced by a Python crawler I implemented many years ago).

Writing an XML parser for the JIRA data would therefore give us a pool of 26 projects that other studies could use to extend the conclusions of the prior work and, to an extent, reproduce it.
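
As a rough illustration of what such a parser might look like, here is a minimal Python sketch. It assumes the dataset's XML resembles JIRA's RSS-style issue export (`<item>` elements with `<key>`, `<summary>`, `<reporter>`, `<created>`, and nested `<comment>` elements); the actual schema of the TSE dataset may differ. The `parse_jira_xml` function and the `SAMPLE_XML` string are hypothetical and are not part of Kaiaulu or the dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, modeled on JIRA's RSS-style XML export.
# The real dataset's element names and attributes may differ.
SAMPLE_XML = """<rss version="0.92">
  <channel>
    <item>
      <key id="1">DEMO-1</key>
      <summary>Example issue</summary>
      <reporter username="alice">Alice</reporter>
      <created>Mon, 1 Jan 2018 10:00:00 +0000</created>
      <comments>
        <comment id="10" author="bob" created="Tue, 2 Jan 2018 09:30:00 +0000">Looks good</comment>
      </comments>
    </item>
  </channel>
</rss>"""


def parse_jira_xml(xml_text):
    """Return one dict per <item> (issue), with its comments nested inside."""
    root = ET.fromstring(xml_text)
    issues = []
    for item in root.iter("item"):
        reporter = item.find("reporter")
        # Each comment keeps its author, timestamp, and body text.
        comments = [
            {
                "author": comment.get("author"),
                "created": comment.get("created"),
                "body": (comment.text or "").strip(),
            }
            for comment in item.findall("comments/comment")
        ]
        issues.append(
            {
                "key": item.findtext("key"),
                "summary": item.findtext("summary"),
                # Tolerate issues that have no <reporter> element.
                "reporter": reporter.get("username") if reporter is not None else None,
                "created": item.findtext("created"),
                "comments": comments,
            }
        )
    return issues


issues = parse_jira_xml(SAMPLE_XML)
```

Each issue comes back as a flat record with its comments attached, which should be straightforward to reshape into the issue and comment tables an analysis would need. Timestamps are left as strings here because the date format in the dataset is not confirmed.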