Closed: draciti closed this issue 5 years ago
The methodology suggested in this ticket requires processing papers whose data was already submitted through the old form with the new pipeline, but this would overwrite the data previously submitted by authors. We decided instead to compare old and new data by looking at the average number of entities provided per paper through each of the two forms, rather than by comparing results on the same papers. Can we close this issue, @draciti?
yes, closing
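For reference, the comparison we settled on (average number of entities per paper for each form) could be sketched roughly as below. This is only an illustration: the function name `average_entities_per_paper`, the dict layout, and the sample data are all hypothetical placeholders, not the actual pipeline code or real submissions.

```python
from statistics import mean

def average_entities_per_paper(submissions):
    """Return the mean number of entities per paper.

    `submissions` is assumed to map a paper ID to the list of
    entities submitted for that paper (hypothetical layout).
    """
    return mean(len(entities) for entities in submissions.values())

# Made-up placeholder data, not real submissions.
old_form = {"paperA": ["gene1", "gene2"], "paperB": ["gene3"]}
new_form = {"paperC": ["gene1", "gene2", "gene4"],
            "paperD": ["gene5", "gene6", "gene7"]}

old_avg = average_entities_per_paper(old_form)
new_avg = average_entities_per_paper(new_form)
print(f"old form: {old_avg:.2f} entities/paper")
print(f"new form: {new_avg:.2f} entities/paper")
```

Because the two forms cover different sets of papers, a difference in averages only hints at a difference between the forms; it does not show that the same paper would yield more or fewer entities.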
Take 10-20 papers from the old pipeline, run them through the new pipeline, and see whether the entities entered by authors in the old one match what we are after.
For example, compare what was entered in the 'text comments' of 'genestudied' with the list of genes extracted by the new pipeline.
See the spreadsheet here for the tables: https://docs.google.com/spreadsheets/d/1sS_uAjBJ2r5H90Lam62Ai0HunjwvfjnklkFNrDoNXeU/edit#gid=1929595460
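A per-paper check like the 'genestudied' example above might look something like the following sketch. It assumes the text comment is a free-text list of gene names separated by commas, semicolons, or whitespace, and that matching can ignore case; both are assumptions, and the helper names and gene names are hypothetical placeholders.

```python
import re

def parse_gene_comment(comment):
    """Split a free-text comment into a set of lowercase tokens.

    Assumes genes are separated by commas, semicolons, or whitespace.
    """
    tokens = re.split(r"[,;\s]+", comment.strip())
    return {t.lower() for t in tokens if t}

def compare_genes(author_comment, extracted_genes):
    """Compare author-entered genes with pipeline-extracted genes."""
    entered = parse_gene_comment(author_comment)
    extracted = {g.lower() for g in extracted_genes}
    return {
        "both": entered & extracted,
        "only_author": entered - extracted,
        "only_pipeline": extracted - entered,
    }

# Placeholder gene names, not taken from any real paper.
result = compare_genes("abc-1, def-2; ghi-3", ["abc-1", "ghi-3", "jkl-4"])
for label, genes in result.items():
    print(label, sorted(genes))
```

If the comments contain prose rather than plain lists, the tokenizer above would pick up ordinary words as well, so a manual pass over the 10-20 papers would probably still be needed.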