Add your notes here. Note: the numbers will need reviewing.
Standardized installation test:
- Time = 1/seconds
- Dependency failures = -1, -2, etc.
- Cross-platform = +1 each for Windows, Linux, and Mac; +2 for web-based
- Cross-platform (docs) = +0.5 each for Windows, Linux, and Mac
- Prerequisite-specific = Python or any other version-specific requirement??
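As a rough sanity check, here is a minimal sketch (in Python) of how these installation-test points might be combined into one number. Only the point values come from the notes above; the function name, argument names, and overall structure are illustrative assumptions.

```python
# Illustrative sketch only: combines the installation-test points listed above.
# The function name and structure are assumptions; the weights mirror the notes.

def installation_score(install_seconds, dependency_failures,
                       platforms, doc_platforms, web_based=False):
    """Return a single installation-test score.

    install_seconds      -- wall-clock install time in seconds
    dependency_failures  -- number of dependencies that failed to resolve
    platforms            -- supported platforms, e.g. {"windows", "linux", "mac"}
    doc_platforms        -- platforms covered by the installation docs
    web_based            -- True if the tool runs in the browser (no local install)
    """
    score = 1.0 / install_seconds                                  # Time = 1/seconds
    score -= dependency_failures                                   # -1 per dependency failure
    score += len(set(platforms) & {"windows", "linux", "mac"})     # +1 per supported platform
    if web_based:
        score += 2                                                 # +2 for web-based
    score += 0.5 * len(set(doc_platforms) & {"windows", "linux", "mac"})  # +0.5 per documented platform
    return score


# Example: 30 s install, one failed dependency, Linux + Mac support, Linux docs only
print(installation_score(30, 1, {"linux", "mac"}, {"linux"}))
```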
Post-install:
- Depth of search = ordinal scale 0 to 10
- Total retrievals per search = comparison ratio a/b, or s/n
- 1%, 5%, 10% relevancy test = ordinal scale -2, -1, 0, +1, +2
- Output formats = PDF +1, HTML/XML +2, text +1.5
- Downstream applications = feature extraction, summarization (+1 for each)
- Interface documentation = ordinal scale 0 to 5
- UI = command line = 1, interactive shell = 2, local GUI (Tk-like) = 3, web browser (HTML) = 4
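A similar sketch for the post-install criteria: only the point values come from the list above, while the lookup-table layout and the idea of summing everything into one score are assumptions that would need reviewing.

```python
# Illustrative lookup tables for the post-install scoring items above.
# Values mirror the notes; anything not listed here is an assumption to be reviewed.
OUTPUT_FORMAT_POINTS = {"pdf": 1.0, "html": 2.0, "xml": 2.0, "text": 1.5}
UI_POINTS = {"command line": 1, "interactive shell": 2, "local gui": 3, "web browser": 4}
DOWNSTREAM_POINTS = 1  # +1 per downstream application (feature extraction, summarization, ...)

def post_install_score(formats, ui, downstream_apps, interface_docs, depth_of_search):
    """Combine the post-install criteria into one number (structure is an assumption)."""
    score = sum(OUTPUT_FORMAT_POINTS.get(f.lower(), 0) for f in formats)
    score += UI_POINTS.get(ui.lower(), 0)
    score += DOWNSTREAM_POINTS * len(downstream_apps)
    score += interface_docs      # ordinal 0 to 5
    score += depth_of_search     # ordinal 0 to 10
    return score

print(post_install_score(["PDF", "HTML"], "web browser", ["summarization"], 3, 7))
```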
Updates:
- New updates (all major releases and bug fixes) in the last 1 year = +n
- New updates (all major releases and bug fixes) in the last 5 years = +n
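If we score updates from release history, a small sketch like the following could count them; the release_dates list is placeholder data, and in practice it would come from the project's changelog or releases page.

```python
# Illustrative sketch: count releases/bug fixes in the last 1 and 5 years.
from datetime import date, timedelta

release_dates = [date(2023, 11, 2), date(2023, 3, 15), date(2020, 6, 1)]  # placeholder data

today = date.today()
last_1y = sum(1 for d in release_dates if today - d <= timedelta(days=365))
last_5y = sum(1 for d in release_dates if today - d <= timedelta(days=5 * 365))

print(f"Releases in last year: +{last_1y}, in last five years: +{last_5y}")
```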
Relevancy testing (to be tested manually or using a TF-IDF algorithm): 1%, 5%, and 10% relevancy tests, assessed on an ordinal scale ranging from -2 to +2:
- 1% for very large datasets
- 5% for medium-sized datasets
- 10% for small datasets (around 100 papers)
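For the TF-IDF variant of the relevancy test, a possible sketch using scikit-learn is below. The sample fraction parameter corresponds to the 1%/5%/10% rule above, but the similarity thresholds used to map onto the -2 to +2 ordinal scale are assumptions and would need calibrating against manual judgements.

```python
# Illustrative sketch of a TF-IDF relevancy check over a sampled subset of results.
# The thresholds mapping mean similarity onto the -2..+2 ordinal scale are assumptions.
import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def relevancy_score(query, retrieved_docs, sample_fraction=0.10, seed=0):
    """Sample a fraction of retrieved documents, score their TF-IDF similarity
    to the query, and map the mean similarity onto the -2..+2 ordinal scale."""
    random.seed(seed)
    k = max(1, int(len(retrieved_docs) * sample_fraction))
    sample = random.sample(retrieved_docs, k)

    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([query] + sample)
    mean_sim = cosine_similarity(matrix[0:1], matrix[1:]).flatten().mean()

    # Assumed thresholds; to be tuned against manual relevancy judgements.
    if mean_sim >= 0.40:
        return +2
    if mean_sim >= 0.25:
        return +1
    if mean_sim >= 0.15:
        return 0
    if mean_sim >= 0.05:
        return -1
    return -2

docs = ["text mining of research papers", "cooking recipes", "literature review automation"]
print(relevancy_score("automated literature review", docs, sample_fraction=1.0))
```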
We need to benchmark ALR against other search methods, as well as have a test benchmark to check that the system's parts are working.
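As a starting point, a very rough harness sketch: time each search method over the same set of queries and compare result counts. The alr_search and baseline_search functions are hypothetical placeholders for the real entry points, and the queries are example data.

```python
# Illustrative benchmarking harness sketch; reports wall-clock time and result
# counts per method. alr_search and baseline_search are hypothetical placeholders.
import time

def alr_search(query):        # placeholder for the ALR search call
    return ["paper-1", "paper-2"]

def baseline_search(query):   # placeholder for a comparison search method
    return ["paper-1"]

queries = ["machine learning", "text mining", "systematic review"]

for name, search in [("ALR", alr_search), ("baseline", baseline_search)]:
    start = time.perf_counter()
    results = [search(q) for q in queries]
    elapsed = time.perf_counter() - start
    total_hits = sum(len(r) for r in results)
    print(f"{name}: {elapsed:.3f}s for {len(queries)} queries, {total_hits} results")
```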
Let's look up recommended methods for benchmarking.
We will need to choose benchmarks too: