R-T3.4-9: The defect-prediction tool must be able to ingest data, also in real time, from multiple sources.

dardin88 commented 4 years ago

ID	R-T3.4-9
Section	WP3: Methodology and Quality Assurance Requirements
Type	FUNCTIONAL_SUITABILITY
User Story	As an Operations Engineer, I want the tool to allow me to select the source (e.g., a GitHub repository) from which the tool gathers data.
Requirement	The defect-prediction tool must be able to ingest data, also in real-time, from multiple sources.
Extended Description	The tool must provide data ingestion connectors to multiple sources (e.g., GitHub, Jira, etc.) so that users can link their repositories to the tool. This enables real-time data ingestion and defect prediction, and lets the tool gather more data on which to continuously train the defect-prediction model.
Priority	Must have
Affected Tools	DEFECT_PRED_TOOL
Means of Verification	Direct implementation of connectors to at least GitHub VCS, Feature checklist
Dependency	R-T3.4-10 https://github.com/radon-h2020/radon-defect-prediction-api/issues/5
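
A minimal sketch of how the "connectors to multiple sources" idea could be structured, assuming a shared connector interface. Every name here (`SourceConnector`, `Commit`, `InMemoryConnector`, `ingest`) is hypothetical and not taken from the tool's code base; the in-memory connector only stands in for a real GitHub or GitLab connector and makes no network calls.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class Commit:
    """Minimal record a connector hands to the predictor (hypothetical shape)."""
    sha: str
    message: str


class SourceConnector(ABC):
    """Hypothetical common interface implemented once per data source."""

    @abstractmethod
    def fetch_commits(self, repository: str) -> Iterator[Commit]:
        """Yield the commits of the given repository."""


class InMemoryConnector(SourceConnector):
    """Stand-in for a real connector; serves commits from a dictionary."""

    def __init__(self, data: Dict[str, List[Commit]]) -> None:
        self._data = data

    def fetch_commits(self, repository: str) -> Iterator[Commit]:
        yield from self._data.get(repository, [])


def ingest(connectors: Dict[str, SourceConnector], source: str, repository: str) -> List[Commit]:
    """Collect commits from the source the user selected."""
    if source not in connectors:
        raise ValueError(f"no connector registered for source {source!r}")
    return list(connectors[source].fetch_commits(repository))
```

With this shape, supporting a further source would mean registering one more `SourceConnector` subclass under a new key, without changing `ingest`.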

stefanodallapalma commented 4 years ago

@gcasale Status: Partially addressed.

Currently, the tool can ingest data from GitHub (but not in real time). A connector will be provided for GitLab to support the industrial case study with ENG. In the future, the Defect Prediction pipeline will provide a trigger that listens to the events of a given repository (commits, opened or closed issues, etc.) and acts accordingly to train a new model.
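
The planned trigger could look roughly like the sketch below, assuming repository events arrive as a plain event-type string plus a repository name. `EventTrigger`, `on`, and `dispatch` are hypothetical names for illustration, not part of the existing pipeline, and the event names are placeholders.

```python
from typing import Callable, Dict, List

# A handler receives the name of the repository the event came from.
Handler = Callable[[str], None]


class EventTrigger:
    """Hypothetical dispatcher mapping repository events to actions."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, event_type: str, handler: Handler) -> None:
        """Register a handler for one event type (e.g. "commit")."""
        self._handlers.setdefault(event_type, []).append(handler)

    def dispatch(self, event_type: str, repository: str) -> int:
        """Run every handler registered for the event; return how many ran."""
        handlers = self._handlers.get(event_type, [])
        for handler in handlers:
            handler(repository)
        return len(handlers)


# Usage: queue a repository for retraining whenever a commit event arrives.
retrain_queue: List[str] = []
trigger = EventTrigger()
trigger.on("commit", retrain_queue.append)
trigger.dispatch("commit", "org/repo")
```

Events with no registered handler are ignored, so sources that emit extra event types would not need special handling.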

stefanodallapalma commented 3 years ago

@dardin88 I guess we can close this issue. The DPT has connectors to both GitHub and GitLab. The requirement on real-time data ingestion can be removed because it (1) is not really needed and (2) would add unwanted complexity. The training of new models can be scheduled periodically (every week, every month, and so forth), and that scheduling should not be the responsibility of the tool itself.

radon-h2020 / radon-defect-prediction-api

R-T3.4-9: The defect-prediction tool must be able to ingest data, also in real time, from multiple sources. #6