shy942 / QueryReformulation

A bug localization technique that applies historical data and query reformulation.
http://homepage.usask.ca/~shy942/queryreform

Specifications and I/O of query reformulation #2

Open masud-technope opened 8 years ago

masud-technope commented 8 years ago

Please provide the specification, input, and output of your algorithm so that I can implement it. Please do not treat this project as solely your responsibility, since I am co-authoring. You focus on paper writing, result analysis, and module integration.

shy942 commented 8 years ago

I have three inputs:

  1. A file (Bug Info File) generated from the bug report collection, containing bug IDs and their titles.
  2. A file (Git Info File) generated from git. The first line of each entry contains the bug ID and the number of files changed to fix that bug; the next line contains the source code links of all those files. Entries repeat in this pattern.
  3. A folder (Source Codes folder) that contains all pre-processed source code files. In each file, the first line gives the link or path of the file within the local directory, and the next line contains the pre-processed tokens.
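A minimal Python sketch of how these inputs might be read, under stated assumptions. The thread does not specify delimiters, so the sketch assumes whitespace-separated fields and links without spaces. The function names `parse_git_info` and `parse_source_file` are hypothetical. Links are collected until the declared count is reached, so the reader accepts them either on one line or one per line.

```python
def parse_git_info(text):
    """Parse Git Info File text into {bug_id: [source_file_link, ...]}.

    Assumed layout (not confirmed by the thread): a header line
    "<bug_id> <n_files>", then the links, whitespace-separated,
    on one or more following lines.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    bug_to_files = {}
    i = 0
    while i < len(lines):
        bug_id, n_files = lines[i].split()[:2]
        n = int(n_files)
        i += 1
        links = []
        # Keep reading lines until the declared number of links is collected.
        while len(links) < n and i < len(lines):
            links.extend(lines[i].split())
            i += 1
        bug_to_files[bug_id] = links
    return bug_to_files


def parse_source_file(text):
    """Parse one pre-processed source file into (link, tokens).

    The first line is the file's link or path; the rest is treated as
    whitespace-separated tokens (the separator is an assumption).
    """
    first, _, rest = text.partition("\n")
    return first.strip(), rest.split()
```

Both functions take the file's text rather than a path, which keeps them easy to try on small hand-written samples before running them on the attached files.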

To-do list (two things):

  1. First, from Files 1 and 2 (above), create a mapping from each bug report's content to its related source code file links.
  2. Then, using File 2 and the folder (3), create a link from each source code file to its content.
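The two to-do items could be sketched in Python roughly as follows. This assumes the inputs have already been loaded into plain dictionaries. The names `map_reports_to_files` and `map_files_to_content`, and the dictionary shapes, are hypothetical and not taken from the repository.

```python
def map_reports_to_files(bug_titles, bug_to_files):
    """To-do 1: join bug report content with its changed files on bug ID.

    bug_titles:   {bug_id: title}            (from the Bug Info File)
    bug_to_files: {bug_id: [link, ...]}      (from the Git Info File)
    Bugs that are missing from either input are skipped.
    """
    return {
        bug_id: (title, bug_to_files[bug_id])
        for bug_id, title in bug_titles.items()
        if bug_id in bug_to_files
    }


def map_files_to_content(bug_to_files, corpus):
    """To-do 2: link each changed source file to its pre-processed content.

    corpus: {link: [token, ...]}             (from the Source Codes folder)
    Links that have no file in the corpus are skipped.
    """
    changed = {link for links in bug_to_files.values() for link in links}
    return {link: corpus[link] for link in changed if link in corpus}
```

Skipping unmatched bug IDs and links is a design choice made for this sketch; raising an error instead may be preferable if the input files are expected to be complete.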

That's all for now. More specifications are coming.

shy942 commented 8 years ago

Attached contents (File 1 and 2) BugInfoFile.txt GitInfoFile.txt

shy942 commented 8 years ago

Attached content (Some source code files)

Attached contents (Example source code files in a folder) ExampleSourceCodeFiles.zip

masud-technope commented 8 years ago

Looks like the data are here. Now I need a clear specification of what I have to do with this data. Please explain the requirements in text and, if possible, with diagrams. I have cloned the repository and will start working on this.

shy942 commented 8 years ago

Requirement Specification:

  1. Keyword-Source Code Linking: At this point, on one side we have the pre-processed keywords associated with each bug report, and on the other side we have the relationship between bug report IDs and buggy source code links. We construct a bipartite graph between the keywords collected from a bug report and its buggy source code locations. One or more keywords can be linked to one or more buggy source code file links, and a source code file link can be linked to one or more keywords.
  2. Source Code-Code Token Linking: At this point, we create another association map for the source code corpus. To construct it, we link each buggy source code file to the code tokens extracted from it. This map associates buggy source code files with source code tokens and is represented by a bipartite graph, in which there are no edges among source code files or among code tokens.
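Both association maps described above could be represented as bipartite graphs stored as a pair of adjacency maps. The following Python sketch shows one possible way to do that. The helper names (`build_bipartite`, `keyword_file_edges`, `file_token_edges`) and the input dictionary shapes are assumptions made for illustration, not the project's actual code.

```python
from collections import defaultdict


def build_bipartite(edges):
    """Build both adjacency maps of a bipartite graph from (left, right) edges.

    Edges only ever connect a left node to a right node, so there are
    no links among left nodes or among right nodes.
    """
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for left, right in edges:
        left_to_right[left].add(right)
        right_to_left[right].add(left)
    return dict(left_to_right), dict(right_to_left)


def keyword_file_edges(bug_keywords, bug_to_files):
    """Yield (keyword, file_link) for every keyword of a bug and every file fixed for it."""
    for bug_id, keywords in bug_keywords.items():
        for keyword in keywords:
            for link in bug_to_files.get(bug_id, []):
                yield keyword, link


def file_token_edges(file_tokens):
    """Yield (file_link, token) for every token extracted from a buggy file."""
    for link, tokens in file_tokens.items():
        for token in tokens:
            yield link, token


# Hypothetical toy data: keywords per bug, changed files per bug, tokens per file.
bug_keywords = {"101": ["crash", "editor"], "102": ["editor"]}
bug_to_files = {"101": ["A.java"], "102": ["A.java", "B.java"]}
file_tokens = {"A.java": ["save", "buffer"], "B.java": ["buffer"]}

keyword_to_files, file_to_keywords = build_bipartite(
    keyword_file_edges(bug_keywords, bug_to_files)
)
file_to_tokens, token_to_files = build_bipartite(file_token_edges(file_tokens))
```

Using sets for the adjacency lists means that a keyword occurring in several bug reports that touch the same file produces a single edge. If edge frequency matters for later ranking, a counter per edge would be the natural replacement.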

shy942 commented 8 years ago

Attached image: mapping

shy942 commented 8 years ago

New GitInfoFile is added here. GitInfoFile2.txt

masud-technope commented 8 years ago

Added new commit. Please check and keep working.

masud-technope commented 8 years ago

Make sure you understand the code changes I made and keep working on it.

masud-technope commented 8 years ago

Make a README file for the project with a short description. Looks good.