OpenSextant / OpenSextantToolbox

A geotagger and entity extractor

example document factory should use a url #8

Closed: spacemansteve closed this 10 years ago

spacemansteve commented 10 years ago

The document factory should use a URL so that PDF, DOC, and other file types are parsed correctly. This lets users have their data files processed simply by adding them to the input directory. Without it, non-plain-text files such as PDFs are parsed as plain text, with bad results.
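A minimal sketch of the idea, not the toolbox's actual example code: it turns every regular file in an input directory into a `file:` URL that a URL-based document factory could then consume. The class name `InputDirectoryUrls`, the method `listInputUrls`, and the default directory name `input` are hypothetical. The commented-out call assumes the example builds its documents through GATE's `Factory.newDocument(URL)`, which this thread does not confirm.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: collects the files of an input directory as URLs.
final class InputDirectoryUrls {

    /** Returns one file: URL per regular file in inputDir, sorted by path. */
    static List<URL> listInputUrls(File inputDir) throws MalformedURLException {
        List<URL> urls = new ArrayList<>();
        // listFiles returns null if inputDir is missing or not a directory.
        File[] files = inputDir.listFiles(File::isFile);
        if (files == null) {
            return urls;
        }
        Arrays.sort(files);
        for (File file : files) {
            // A URL keeps the file name and extension visible to the consumer,
            // unlike a string of already-decoded file content.
            urls.add(file.toURI().toURL());
        }
        return urls;
    }

    public static void main(String[] args) throws MalformedURLException {
        File inputDir = new File(args.length > 0 ? args[0] : "input");
        for (URL url : listInputUrls(inputDir)) {
            // Assumption: with GATE on the classpath, the document would be
            // created from the URL here rather than from raw text, e.g.
            //   gate.Document doc = gate.Factory.newDocument(url);
            System.out.println(url);
        }
    }
}
```

Passing the URL rather than the file's text leaves format detection to the document factory, which is what allows a PDF dropped into the directory to be handled differently from a plain-text file.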

dlutz2 commented 10 years ago

Steve, the commented-out lines just below that were meant to show how to handle non-plain-text files. I added the plain-text-specific version as an experiment and forgot to switch it back to the general case. I will switch it back and add some comments. Does that work for you?

dlutz2 commented 10 years ago

Steve, I added comments and tweaked the code.

spacemansteve commented 10 years ago

Good Morning!

Thanks for the quick response. Yes, your suggestion is perfect. For me, it was really helpful to run my PDF files through the test code to get a feel for OpenSextant.

Steve

On Fri, Feb 28, 2014 at 5:16 PM, dlutz2 <notifications@github.com> wrote:

Steve, the commented-out lines just below that were meant to show how to handle non-plain-text files. I added the plain-text-specific version as an experiment and forgot to switch it back to the general case. I will switch it back and add some comments. Does that work for you?
