mantono / DuplicateSearcher

Identification of Duplicate Tickets in Issue Tracking Systems for Software Development

Graph based data structure #12

Open mantono opened 8 years ago

mantono commented 8 years ago

Create a graph-based data structure in which related issues are connected by edges. The aim is to reduce the time complexity of lookup and similarity comparison: an array-based model needs O(n²) comparisons, which does not scale well to larger repositories. Due to limited time, this is a low-priority issue.
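A minimal sketch of what such a structure could look like, written in Python purely for illustration. The class name `IssueGraph` and its methods are hypothetical and not taken from this repository; how two issues are judged "related" is left open here. The point is only that, once edges exist, looking up the issues related to a given one no longer needs a scan over all issues.

```python
class IssueGraph:
    """Undirected graph: nodes are issue numbers, edges link related issues.

    Hypothetical illustration; not the project's actual data structure.
    """

    def __init__(self):
        # Adjacency sets: issue number -> set of related issue numbers.
        self._edges = {}

    def add_issue(self, issue):
        """Register an issue without relating it to anything yet."""
        self._edges.setdefault(issue, set())

    def relate(self, a, b):
        """Add an undirected edge between two distinct issues."""
        if a == b:
            return
        self._edges.setdefault(a, set()).add(b)
        self._edges.setdefault(b, set()).add(a)

    def related(self, issue):
        """Return a copy of the set of issues directly linked to `issue`."""
        return set(self._edges.get(issue, ()))


graph = IssueGraph()
graph.relate(11, 12)
graph.relate(12, 40)
print(sorted(graph.related(12)))
```

With adjacency sets, `related` costs time proportional to the number of neighbours of one issue rather than to the size of the whole repository. Building the edges in the first place still needs some way of finding candidate pairs, which this sketch does not address.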

Issues it depends on: #11

mantono commented 8 years ago

The current data structure/search algorithm makes roughly n²/2 pairwise comparisons, which is still O(n²), and requires 8 hours, 55 minutes and 20.015 seconds to process 15 025 issues.
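To put the measurement above in perspective, here is a rough back-of-the-envelope calculation in Python. It assumes that each unordered pair of issues is compared exactly once, i.e. n(n-1)/2 comparisons, and that the whole run time is spent on those comparisons; both are assumptions, not something measured here.

```python
issues = 15_025
elapsed_seconds = 8 * 3600 + 55 * 60 + 20.015  # 8 h 55 min 20.015 s

# Assumption: every unordered pair of issues is compared exactly once.
pairs = issues * (issues - 1) // 2
seconds_per_pair = elapsed_seconds / pairs

print(f"{pairs:,} comparisons")
print(f"{seconds_per_pair * 1000:.3f} ms per comparison")
```

Under these assumptions the run covers about 112.9 million comparisons. Because the count grows quadratically, doubling the number of issues would roughly quadruple the run time.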