Only select valid refactoring instances from the refactoringcommit table

mauricioaniche commented 4 years ago

You may also want to add an @Index on the column to speed up later queries.

jan-gerling commented 4 years ago

For now, it seems it marks all of them as valid. That's one option.

I would go with this option. Otherwise, the column would be missing from the data import for the ML pipeline, and keeping it lets potential users see that this is a relevant field.

Another one would be to leave it as is in the data collection and, later, in a processing step, decide which ones are valid or not (by creating this column or even deleting the rows).

This is an option too, but it is faster and easier if we already do it during the data collection.

jan-gerling commented 4 years ago

You may also want to add an @Index on the column to speed up later queries.

Done
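
For readers following along, here is a minimal sketch of the idea discussed above: a validity flag stored at collection time, an index on that flag, and a query that selects only the valid rows. It uses plain SQLite from Python rather than the project's own persistence layer, so it does not show the @Index annotation itself. The column name `is_valid`, the index name, the other columns, and the sample rows are all assumptions made for illustration; only the table name `refactoringcommit` comes from this thread.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a validity flag and an index on it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: only the table name is taken from the thread.
    conn.execute(
        "CREATE TABLE refactoringcommit ("
        " id INTEGER PRIMARY KEY,"
        " refactoring TEXT NOT NULL,"
        " is_valid INTEGER NOT NULL DEFAULT 1)"
    )
    # Index on the flag, so later queries filtering on it can use the index.
    conn.execute(
        "CREATE INDEX idx_refactoringcommit_is_valid"
        " ON refactoringcommit (is_valid)"
    )
    # Made-up sample rows; the flag is set while the data is collected.
    rows = [("Extract Method", 1), ("Rename Class", 0), ("Move Method", 1)]
    conn.executemany(
        "INSERT INTO refactoringcommit (refactoring, is_valid) VALUES (?, ?)",
        rows,
    )
    conn.commit()
    return conn


def select_valid(conn):
    """Return only the refactoring instances marked as valid."""
    cur = conn.execute(
        "SELECT refactoring FROM refactoringcommit"
        " WHERE is_valid = 1 ORDER BY id"
    )
    return [row[0] for row in cur.fetchall()]


def index_names(conn):
    """List the indexes defined on the refactoringcommit table."""
    cur = conn.execute(
        "SELECT name FROM sqlite_master"
        " WHERE type = 'index' AND tbl_name = 'refactoringcommit'"
    )
    return [row[0] for row in cur.fetchall()]
```

Calling `select_valid(build_demo_db())` returns only the rows whose flag is set. The alternative mentioned earlier, deciding validity in a later processing step, would amount to running an `UPDATE` or `DELETE` on the same table after collection.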

refactoring-ai / predicting-refactoring-ml

Only select valid refactoring instances from the refactoringcommit table #195