Open jonaschn opened 3 years ago
scriptDataPath is used only within some of the classes in the tem.script package and in the SimpleEvaluate class of the tem.main package. In particular, scriptDataPath is used to read the userID file. To me, the export step and the training set creation step are both clear, but I can't figure out how to create the test set. Any ideas?
@ifonema Did you notice my fork https://github.com/jonaschn/TopicExpertiseModel? I refactored a little bit and improved the code documentation. Therein, I also mention how the test set is created:
@jonaschn Thank you for your fork, I am using it now.
Method readQATestDocs in line 43 reads questions from the testData.question file, as can be seen from the following code snippet:
https://github.com/yangliuy/TopicExpertiseModel/blob/0f918d78a420b9cf03834576d6595060e64f8a56/src/tem/main/Documents.java#L77 https://github.com/yangliuy/TopicExpertiseModel/blob/0f918d78a420b9cf03834576d6595060e64f8a56/src/tem/main/Documents.java#L91
The testData.question file is created in the main method of the ExportTestDataForRank class
https://github.com/yangliuy/TopicExpertiseModel/blob/0f918d78a420b9cf03834576d6595060e64f8a56/src/tem/script/ExportTestDataForRank.java#L35-L36
where the testDataQuestions.id file is used in order to export data from the database, but I don't know where the testDataQuestions.id file is created.
A naive solution consists of creating a tab-separated text file in which the second column contains the ids of the questions belonging to the test set, but I'd like to know whether the repository contains code for the automatic generation of the test set.
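For what it's worth, the naive approach could be sketched roughly like this. It is only an illustration and not code from the repository: the class name NaiveTestIdFile, the sample ids, and the use of a running row index as the first column are all my assumptions. The only thing taken from the description above is that the file is tab-separated and the second column holds the question ids.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of TopicExpertiseModel.
class NaiveTestIdFile {

    // Builds one tab-separated line per question id.
    // Column 1: running row index (an assumption, only a placeholder).
    // Column 2: the question id, as described in the comment above.
    static List<String> buildLines(List<Long> questionIds) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < questionIds.size(); i++) {
            lines.add(i + "\t" + questionIds.get(i));
        }
        return lines;
    }

    // Writes the lines to the given file as UTF-8, overwriting any existing file.
    static void write(Path target, List<Long> questionIds) throws IOException {
        Files.write(target, buildLines(questionIds), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Made-up ids for illustration; replace them with the ids of the
        // questions you want in the test set.
        List<Long> ids = Arrays.asList(101L, 202L, 303L);
        // Output location is an assumption; pass a path as the first argument
        // to place the file wherever scriptDataPath points in your setup.
        Path target = Paths.get(args.length > 0 ? args[0] : "testDataQuestions.id");
        write(target, ids);
    }
}
```

Before relying on it, it would be worth checking how ExportTestDataForRank actually parses testDataQuestions.id, since the first column may need to hold something other than a row index.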
To be honest: I don't know. I didn't try to generate any test data for myself.
I could not find the data which is expected to be in the scriptDataPath. The Dropbox upload contains (to my knowledge) only the training (originalDataPath) and test data (testDataPath). https://github.com/yangliuy/TopicExpertiseModel/blob/0f918d78a420b9cf03834576d6595060e64f8a56/src/tem/conf/PathConfig.java#L5