sailuh / kaiaulu

An R package for mining software repositories
http://itm0.shidler.hawaii.edu/kaiaulu
Mozilla Public License 2.0

Making Parsed Source Code Data Available Externally #314

Open daomcgill opened 6 days ago

daomcgill commented 6 days ago

Purpose

This issue extends issue #313. The goal is to create configurable /exec scripts that make the parsed data tables available externally. The new scripts will make the syntax extraction process easier to use by providing a direct way to annotate source code and query the resulting XML.

Process

  1. Create a script for annotating source code using srcML.
  2. Create a script for querying the annotated data. It will accept either a predefined query or a user-defined XPath query.
  3. Document the new scripts.

New Scripts

/exec/syntaxextractor.R: Script for running the syntax extractor using existing functions in R/src.R. Its functionality is split into two parts:

  1. Annotation: Takes in a source code folder and uses srcML to generate an annotated XML file.
  2. Querying: Accepts a predefined XPath query to extract syntactic elements from the XML file, or a custom XPath query specified by the user, and outputs the query results.
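
To illustrate the querying step, here is a minimal sketch of how a "predefined or custom XPath" lookup could behave. It is not the R/src.R implementation: it is written in Python with only the standard library, the names `PREDEFINED_QUERIES` and `run_query` are hypothetical, and `SAMPLE_XML` is a small hand-written stand-in for srcML output rather than a real annotated project. It assumes the srcML namespace `http://www.srcML.org/srcML/src` bound to the `src` prefix.

```python
import xml.etree.ElementTree as ET

# srcML elements live in this namespace; "src" is the prefix used in queries.
SRC_NS = {"src": "http://www.srcML.org/srcML/src"}

# Hypothetical predefined queries, keyed by a short name.
PREDEFINED_QUERIES = {
    "functions": ".//src:function/src:name",
    "classes": ".//src:class/src:name",
}

# Hand-written stand-in for an annotated file (not real srcML output).
SAMPLE_XML = """<unit xmlns="http://www.srcML.org/srcML/src" language="Java" filename="Calc.java">
<class>class <name>Calc</name> <block>{
<function><type><name>int</name></type> <name>add</name><parameter_list>()</parameter_list> <block>{}</block></function>
<function><type><name>int</name></type> <name>sub</name><parameter_list>()</parameter_list> <block>{}</block></function>
}</block></class>
</unit>"""


def run_query(xml_text, query):
    """Run a predefined query by name, or treat `query` as a custom XPath."""
    xpath = PREDEFINED_QUERIES.get(query, query)
    root = ET.fromstring(xml_text)
    # Join all text under each matched node so nested names are kept whole.
    return ["".join(node.itertext()).strip() for node in root.findall(xpath, SRC_NS)]


if __name__ == "__main__":
    print(run_query(SAMPLE_XML, "functions"))
    print(run_query(SAMPLE_XML, ".//src:function/src:type/src:name"))
```

Note that `xml.etree.ElementTree` supports only a subset of XPath, so an actual script would likely need a full XPath engine to accept arbitrary user queries.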

Task List


daomcgill commented 6 days ago

@carlosparadis part II

carlosparadis commented 6 days ago

@daomcgill For this one I would consider making two execs, one that annotates, and the other that can query the file. Annotating can take a long time depending on the size of the project, hence the split.

Otherwise, I think this is good! We can take another pass once #313 is done.

Thanks!