Add your notes here. Note: the numbers will need reviewing.
Standardized installation test:
- Time = 1/seconds
- Dependency failures = -1, -2, etc.
- Cross-platform = +1 each for Windows, Linux, and Mac; +2 for web-based
- Cross-platform (docs) = +0.5 each for Windows, Linux, and Mac
- Prerequisite-specific = Python or any other version-specific requirement??
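As a rough sanity check, here is a minimal sketch (in Python) of how these installation-test points might be combined into one number. Only the point values come from the notes above; the function name, argument names, and overall structure are illustrative assumptions.

```python
# Illustrative sketch only: combines the installation-test points listed above.
# The function name and structure are assumptions; the weights mirror the notes.

def installation_score(install_seconds, dependency_failures,
                       platforms, doc_platforms, web_based=False):
    """Return a single installation-test score.

    install_seconds      -- wall-clock install time in seconds
    dependency_failures  -- number of dependencies that failed to resolve
    platforms            -- supported platforms, e.g. {"windows", "linux", "mac"}
    doc_platforms        -- platforms covered by the installation docs
    web_based            -- True if the tool runs in the browser (no local install)
    """
    score = 1.0 / install_seconds                                  # Time = 1/seconds
    score -= dependency_failures                                   # -1 per dependency failure
    score += len(set(platforms) & {"windows", "linux", "mac"})     # +1 per supported platform
    if web_based:
        score += 2                                                 # +2 for web-based
    score += 0.5 * len(set(doc_platforms) & {"windows", "linux", "mac"})  # +0.5 per documented platform
    return score


# Example: 30 s install, one failed dependency, Linux + Mac support, Linux docs only
print(installation_score(30, 1, {"linux", "mac"}, {"linux"}))
```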
Post-install:
- Depth of search = ordinal scale 0 to 10
- Total retrievals per search = comparison ratio a/b, or s/n
- 1%, 5%, 10% relevancy test = ordinal scale -2, -1, 0, +1, +2
- Output formats = PDF +1, HTML/XML +2, text +1.5
- Downstream applications = feature extraction, summarization (+1 for each)
- Interface documentation = ordinal scale 0 to 5
- UI = command line = 1, interactive shell = 2, local GUI (Tk-like) = 3, web browser (HTML) = 4
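A similar sketch for the post-install criteria: only the point values come from the list above, while the lookup-table layout and the idea of summing everything into one score are assumptions that would need reviewing.

```python
# Illustrative lookup tables for the post-install scoring items above.
# Values mirror the notes; anything not listed here is an assumption to be reviewed.
OUTPUT_FORMAT_POINTS = {"pdf": 1.0, "html": 2.0, "xml": 2.0, "text": 1.5}
UI_POINTS = {"command line": 1, "interactive shell": 2, "local gui": 3, "web browser": 4}
DOWNSTREAM_POINTS = 1  # +1 per downstream application (feature extraction, summarization, ...)

def post_install_score(formats, ui, downstream_apps, interface_docs, depth_of_search):
    """Combine the post-install criteria into one number (structure is an assumption)."""
    score = sum(OUTPUT_FORMAT_POINTS.get(f.lower(), 0) for f in formats)
    score += UI_POINTS.get(ui.lower(), 0)
    score += DOWNSTREAM_POINTS * len(downstream_apps)
    score += interface_docs      # ordinal 0 to 5
    score += depth_of_search     # ordinal 0 to 10
    return score

print(post_install_score(["PDF", "HTML"], "web browser", ["summarization"], 3, 7))
```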
Updates:
- New updates (all major releases and bug fixes) in the last 1 year = +n
- New updates (all major releases and bug fixes) in the last 5 years = +n
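If we score updates from release history, a small sketch like the following could count them; the release_dates list is placeholder data, and in practice it would come from the project's changelog or releases page.

```python
# Illustrative sketch: count releases/bug fixes in the last 1 and 5 years.
from datetime import date, timedelta

release_dates = [date(2023, 11, 2), date(2023, 3, 15), date(2020, 6, 1)]  # placeholder data

today = date.today()
last_1y = sum(1 for d in release_dates if today - d <= timedelta(days=365))
last_5y = sum(1 for d in release_dates if today - d <= timedelta(days=5 * 365))

print(f"Releases in last year: +{last_1y}, in last five years: +{last_5y}")
```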
Relevancy testing (to be tested manually or using a TF-IDF algorithm): 1%, 5%, and 10% relevancy tests, assessed on an ordinal scale ranging from -2 to +2:
- 1% for very large datasets
- 5% for medium-sized datasets
- 10% for small datasets (around 100 papers)
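For the TF-IDF variant of the relevancy test, a possible sketch using scikit-learn is below. The sample fraction parameter corresponds to the 1%/5%/10% rule above, but the similarity thresholds used to map onto the -2 to +2 ordinal scale are assumptions and would need calibrating against manual judgements.

```python
# Illustrative sketch of a TF-IDF relevancy check over a sampled subset of results.
# The thresholds mapping mean similarity onto the -2..+2 ordinal scale are assumptions.
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def relevancy_score(query, retrieved_docs, sample_fraction=0.10, seed=0):
    """Sample a fraction of retrieved documents, score their TF-IDF similarity
    to the query, and map the mean similarity onto the -2..+2 ordinal scale."""
    random.seed(seed)
    k = max(1, int(len(retrieved_docs) * sample_fraction))
    sample = random.sample(retrieved_docs, k)

    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([query] + sample)
    mean_sim = cosine_similarity(matrix[0:1], matrix[1:]).flatten().mean()

    # Assumed thresholds; to be tuned against manual relevancy judgements.
    if mean_sim >= 0.40:
        return +2
    if mean_sim >= 0.25:
        return +1
    if mean_sim >= 0.15:
        return 0
    if mean_sim >= 0.05:
        return -1
    return -2

docs = ["text mining of research papers", "cooking recipes", "literature review automation"]
print(relevancy_score("automated literature review", docs, sample_fraction=1.0))
```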
We need to benchmark ALR against other search methods, as well as have a test benchmark to check that the system's parts are working.
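As a starting point, a very rough harness sketch: time each search method over the same set of queries and compare result counts. The alr_search and baseline_search functions are hypothetical placeholders for the real entry points, and the queries are example data.

```python
# Illustrative benchmarking harness sketch; reports wall-clock time and result
# counts per method. alr_search and baseline_search are hypothetical placeholders.
import time

def alr_search(query):        # placeholder for the ALR search call
    return ["paper-1", "paper-2"]

def baseline_search(query):   # placeholder for a comparison search method
    return ["paper-1"]

queries = ["machine learning", "text mining", "systematic review"]

for name, search in [("ALR", alr_search), ("baseline", baseline_search)]:
    start = time.perf_counter()
    results = [search(q) for q in queries]
    elapsed = time.perf_counter() - start
    total_hits = sum(len(r) for r in results)
    print(f"{name}: {elapsed:.3f}s for {len(queries)} queries, {total_hits} results")
```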
Let's look up recommended methods for benchmarking.
We will need to choose benchmarks too: