I wonder if it would make sense to weight the recall by co-occurrence: for each tree name, the sum of the co-occurrences of the candidates we found, divided by the sum of the co-occurrences of all possible candidates for that tree name? So if the tree name Smith had two candidates, Smyth (900 co-occurrences) and Snith (100 co-occurrences), and we found Smyth but not Snith, then the recall for Smith would be 90% (900 / 1000) instead of 50%.
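To make the proposal concrete, here is a minimal sketch of the weighted metric. The function name `weighted_recall` and the dict-of-counts input shape are my assumptions for illustration, not anything in the existing code.

```python
def weighted_recall(candidates, found):
    """Co-occurrence-weighted recall for a single tree name.

    candidates: dict mapping each possible candidate name to its
                co-occurrence count (hypothetical input shape).
    found:      set of candidate names that were actually returned.
    """
    total = sum(candidates.values())
    if total == 0:
        # No co-occurrences at all: recall is undefined; 0.0 is an assumption.
        return 0.0
    hit = sum(count for name, count in candidates.items() if name in found)
    return hit / total


# The Smith example from above.
smith = {"Smyth": 900, "Snith": 100}
found = {"Smyth"}

unweighted = len(smith.keys() & found) / len(smith)  # 1 of 2 candidates found
weighted = weighted_recall(smith, found)             # 900 of 1000 co-occurrences
```

This gives the per-name value only. How to aggregate across tree names (a plain average, or weighting each name by its total co-occurrences) is a separate choice that would still need to be decided.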