The TF-IDF match speed was improved by retrieving the array once before the loop instead of on every iteration.
The algorithm implementation was also improved by removing the test data from the training data.
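The speed-up described above can be illustrated with a minimal sketch. It is not the project's actual code: `TfidfMatrix`, `to_array`, `best_match_slow`, and `best_match_fast` are hypothetical names, and the toy class only stands in for whatever object held the TF-IDF values. The point is that an expensive conversion called inside the loop runs once per iteration, while hoisting it runs once in total.

```python
import math
from collections import Counter


class TfidfMatrix:
    """Toy stand-in for a TF-IDF matrix; to_array() is the costly step."""

    def __init__(self, docs):
        self.docs = [doc.lower().split() for doc in docs]
        self.calls = 0  # counts how often the dense array is rebuilt

    def to_array(self):
        self.calls += 1
        vocab = sorted({tok for doc in self.docs for tok in doc})
        n = len(self.docs)
        df = Counter(tok for doc in self.docs for tok in set(doc))
        rows = []
        for doc in self.docs:
            tf = Counter(doc)
            # Term frequency times a smoothed inverse document frequency.
            rows.append([tf[t] * (math.log((1 + n) / (1 + df[t])) + 1) for t in vocab])
        return rows


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match_slow(matrix, query):
    best, best_score = None, -1.0
    for i in range(len(matrix.docs)):
        rows = matrix.to_array()  # rebuilt on every iteration
        if i != query:
            score = cosine(rows[query], rows[i])
            if score > best_score:
                best, best_score = i, score
    return best


def best_match_fast(matrix, query):
    rows = matrix.to_array()  # built once, before the loop
    best, best_score = None, -1.0
    for i in range(len(rows)):
        if i != query:
            score = cosine(rows[query], rows[i])
            if score > best_score:
                best, best_score = i, score
    return best


docs = ["red apple pie", "green apple tart", "blue car engine"]
slow_matrix, fast_matrix = TfidfMatrix(docs), TfidfMatrix(docs)
slow_result = best_match_slow(slow_matrix, 0)
fast_result = best_match_fast(fast_matrix, 0)
```

Both functions return the same match; only the number of array constructions differs, which is where the saving would come from on a large corpus.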
Comparisons
After decreasing the match threshold from 80% to 30%, the match accuracy improved from 58% to 60%. In addition, a slight refactoring of the code reduced the evaluation time from 3238 seconds to 59.02 seconds.
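As a rough illustration of what the threshold controls, the sketch below assumes each query already has a best candidate with a similarity score between 0 and 1. The function name `accept_matches` and the scores are made up for the example and do not come from the project's data.

```python
def accept_matches(scores, threshold):
    """Keep only the candidates whose similarity score reaches the threshold."""
    return {
        query: (label, score)
        for query, (label, score) in scores.items()
        if score >= threshold
    }


# Hypothetical best-candidate scores, for illustration only.
scores = {"q1": ("A", 0.85), "q2": ("B", 0.45), "q3": ("C", 0.20)}

strict = accept_matches(scores, 0.80)  # only q1 is accepted
loose = accept_matches(scores, 0.30)   # q1 and q2 are accepted
```

With the lower threshold more queries receive a match at all, which is one plausible reason the measured accuracy could rise; whether the extra matches are correct still depends on the data.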
Following are the changes in matching results:
After removing the test data from training data, the accuracy further increased from 60% to 62%.
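One simple way to separate the two sets is sketched below. It assumes the records are hashable and that an exact-equality check is enough to detect overlap; `remove_test_from_train` and the sample records are hypothetical, not taken from the project.

```python
def remove_test_from_train(train, test):
    """Return the training records that do not also appear in the test set."""
    held_out = set(test)
    return [record for record in train if record not in held_out]


# Illustrative records only.
train = ["red apple pie", "green apple tart", "blue car engine"]
test = ["green apple tart"]

clean_train = remove_test_from_train(train, test)
```

After filtering, no test record can be matched against an identical copy of itself in the training data, so the evaluation measures matching on unseen items.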
Following are the changes in the results: