patricknee closed this issue 7 years ago
I checked in a fix for TransformerCli not reading Greenbook patent dumps.
I was not passing in a file filter; also, some Greenbook patents don't have an abstract.
In the bulk file you're processing, the first 10 patent documents do not have an abstract, but when I jump to the 100th document, it does have one.
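For anyone hitting the same case, here is a minimal sketch of treating the abstract as optional so that a document without one does not trip up downstream code. This is not the project's actual code: `PatentRecord`, `getAbstract`, and `abstractOrEmpty` are hypothetical names made up for illustration.

```java
import java.util.Optional;

public class AbstractSketch {

    /** Hypothetical stand-in for a parsed patent document; not the project's real class. */
    public static final class PatentRecord {
        private final String id;
        private final String abstractText; // null when the source document has no abstract

        public PatentRecord(String id, String abstractText) {
            this.id = id;
            this.abstractText = abstractText;
        }

        public String getId() {
            return id;
        }

        /** Returns the trimmed abstract, or an empty Optional if it is missing or blank. */
        public Optional<String> getAbstract() {
            return Optional.ofNullable(abstractText)
                    .map(String::trim)
                    .filter(s -> !s.isEmpty());
        }
    }

    /** Falls back to an empty string instead of dereferencing a missing abstract. */
    public static String abstractOrEmpty(PatentRecord record) {
        return record.getAbstract().orElse("");
    }

    public static void main(String[] args) {
        PatentRecord withoutAbstract = new PatentRecord("doc-1", null);
        PatentRecord withAbstract = new PatentRecord("doc-100", "  A widget.  ");

        System.out.println("[" + abstractOrEmpty(withoutAbstract) + "]");
        System.out.println("[" + abstractOrEmpty(withAbstract) + "]");
    }
}
```

Returning an `Optional` makes the "no abstract" case explicit at the call site, which is one way to avoid a null dereference when early documents in a bulk file lack the field.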
I spot-tested TransformerCli on files downloaded from 1980, 1990, 2000, and 2010; all of them succeeded. Thanks for the fix.
A file from 1980, freshly downloaded with BulkDownloader, crashes with a NULL pointer: