One issue with the current approach is that the parsed citations, which are extracted using machine learning methods, may contain errors. It might be better to just search the full bag of words of the citation against all the fields (https://stackoverflow.com/questions/15170097/how-to-search-across-all-the-fields), probably using the catch-all field.
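A rough sketch of what the bag-of-words search could look like, assuming the index is Elasticsearch and that a catch-all field has been set up (for example via `copy_to`). The field name `all_text`, the helper name `build_catch_all_query`, and the `60%` threshold are all made up for illustration and would need tuning against real data. The function only builds the query body; sending it to the index is left out.

```python
import json


def build_catch_all_query(citation: str, field: str = "all_text",
                          min_match: str = "60%") -> dict:
    """Build a match query that searches the raw citation text
    against a single catch-all field (hypothetical field name)."""
    # Collapse whitespace so line breaks from extraction do not matter.
    text = " ".join(citation.split())
    return {
        "query": {
            "match": {
                field: {
                    "query": text,
                    # OR semantics: a few wrong or missing tokens
                    # should not prevent a hit.
                    "operator": "or",
                    # Assumed threshold; requires a share of the
                    # tokens to match to cut down on noise.
                    "minimum_should_match": min_match,
                }
            }
        }
    }


if __name__ == "__main__":
    raw = "Smith, J.\n  (2010).  An example title. Journal of Examples, 12(3)."
    print(json.dumps(build_catch_all_query(raw), indent=2))
```

Because the whole citation string is used as the query, nothing depends on the parser having labelled authors, title, or year correctly.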
A tool that just extracts the references section from a PDF, without parsing those references, would be useful. See https://www.crossref.org/labs/resolving-citations-we-dont-need-no-stinkin-parser/
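As a starting point, such a tool could be as simple as the sketch below. It assumes the PDF has already been converted to plain text by some other tool (not shown), and that the reference list starts at a heading on its own line. The heading list and the function name `extract_references_section` are assumptions, not an existing API, and real papers will need more heuristics (appendices after the references, multi-column layouts, and so on).

```python
import re

# Headings that commonly open a reference list. This set is an assumption
# and would need extending for other languages and styles.
HEADING = re.compile(
    r"^[ \t]*(?:\d+\.?[ \t]*)?"
    r"(?:references|bibliography|works cited|literature cited)"
    r"[ \t]*:?[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_references_section(text: str) -> str:
    """Return the text after the last reference-list heading,
    or an empty string if no heading is found."""
    matches = list(HEADING.finditer(text))
    if not matches:
        return ""
    # Use the last match: the word can also appear as a line of its own
    # earlier in the document, e.g. in a table of contents.
    return text[matches[-1].end():].strip()


if __name__ == "__main__":
    sample = (
        "Intro\nSome body text\nReferences\n"
        "[1] A. Author. Title. 2010.\n"
        "[2] B. Author. Other. 2012.\n"
    )
    print(extract_references_section(sample))
```

The output is the unparsed reference block, which could then be split into individual citation strings and searched as raw text.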