Closed mxhdev closed 8 years ago
Regular expressions might be useful for finding all the comments.
One solution (does not work for "#" comments): http://stackoverflow.com/questions/21017075/regex-to-find-sql-comments
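A rough sketch of how a single regular expression could cover `--`, `#`, and `/* ... */` comments. This is not the pattern from the linked answer, and the names `SQL_TOKEN` and `find_sql_comments` are made up for illustration. String literals are matched first so that comment markers inside them are skipped; dialect-specific rules (for example nested block comments) are not handled.

```python
import re

# Hypothetical pattern (an assumption, not the project's actual regex).
# Quoted strings are consumed first so that markers inside them are ignored;
# only matches that fill the named group "comment" are real comments.
SQL_TOKEN = re.compile(
    r"""
      '(?:[^'\\]|\\.)*'         # single-quoted string literal
    | "(?:[^"\\]|\\.)*"         # double-quoted string or identifier
    | (?P<comment>
          --[^\n]*              # line comment starting with two dashes
        | \#[^\n]*              # line comment starting with a hash (MySQL style)
        | /\*.*?\*/             # block comment, may span several lines
      )
    """,
    re.VERBOSE | re.DOTALL,
)


def find_sql_comments(sql: str) -> list[str]:
    """Return all comments found in `sql`, in order of appearance."""
    return [
        match.group("comment")
        for match in SQL_TOKEN.finditer(sql)
        if match.group("comment") is not None
    ]


example = "SELECT '# not a comment' FROM t; -- first\n# second\n/* third\nline */ SELECT 1;"
comments = find_sql_comments(example)
```

Matching the string literals in the same pattern avoids a separate pre-processing step, at the cost of a slightly longer expression.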
Functionality was added with commit 3d9778778deeb776f1701c65c7921114e1be819d. Tasks without comments are documented as "no comment found for [Submission]" in the report. The best algorithm for calculating the similarity score still has to be determined. At the moment, Levenshtein distance is implemented.
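For reference, a minimal sketch of Levenshtein distance and one possible way to turn it into a similarity score between 0 and 1. The normalization by the longer string's length and the function names are assumptions, not necessarily what the commit does.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions and substitutions
    needed to turn `a` into `b` (dynamic programming, two rows)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1.0 for identical strings, 0.0 for maximal distance."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

Because the distance works on characters, two comments that contain the same words in a different order get a low score.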
The similarity measure was changed to cosine similarity with commit f7a043b722573dfd358004b38fe6a4cfc4a18983.
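A minimal sketch of cosine similarity between two comments, assuming each comment is represented as a vector of word counts. The tokenization and the names `tokenize` and `cosine_similarity` are illustrative assumptions; the commit may build its vectors differently.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Assumed tokenization: lowercase word characters only."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the word-count vectors of `a` and `b`."""
    counts_a = Counter(tokenize(a))
    counts_b = Counter(tokenize(b))
    # Only words that occur in both texts contribute to the dot product.
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(count * count for count in counts_a.values()))
    norm_b = math.sqrt(sum(count * count for count in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty comment is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

With word counts as the representation, the score ignores word order, so a comment with rearranged words still scores close to 1.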
Introduction
After checking all submissions, the algorithm should compare the comments of all submissions in order to find duplicate submissions.
General Process
Problems / Things to Consider