NYCPlanning / data-engineering

Primary repository for NYC DCP's Data Engineering team

ESRI PDF Parser #920

Closed: alexrichey closed 3 months ago

alexrichey commented 3 months ago

Adds a very janky parser that generates column metadata from the contents of an ESRI PDF. Also updates the documentation to list the ways one could generate metadata.

I wouldn't waste too much time reviewing the logic of the PDF-text parser. If this is something we continue with, we might want to use an actual PDF parser, or just grab the data from an FGDB or somehow directly from ESRI. (There's got to be a better way!)
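
For anyone skimming, here is a rough sketch of the general idea behind turning extracted PDF text into column metadata. It is not the code in this PR. The `Field Name:` / `Data Type:` / `Description:` labels, the `ColumnMetadata` class, and `parse_columns` are all hypothetical, and the real ESRI PDF layout may look quite different.

```python
from dataclasses import dataclass


@dataclass
class ColumnMetadata:
    """Hypothetical container for one column's metadata."""
    name: str
    data_type: str = ""
    description: str = ""


# Assumed labels in the extracted text; the real ESRI PDF may use others.
_LABELS = {
    "field name": "name",
    "data type": "data_type",
    "description": "description",
}


def parse_columns(text: str) -> list[ColumnMetadata]:
    """Build column metadata from 'Label: value' lines of extracted PDF text."""
    columns: list[ColumnMetadata] = []
    for raw_line in text.splitlines():
        label, sep, value = raw_line.partition(":")
        attr = _LABELS.get(label.strip().lower())
        if not sep or attr is None:
            continue  # not a "Label: value" line we recognize (page footer, blank, etc.)
        value = value.strip()
        if attr == "name":
            # A field-name line starts a new column record.
            columns.append(ColumnMetadata(name=value))
        elif columns:
            # Any other known label fills in the most recent column.
            setattr(columns[-1], attr, value)
    return columns
```

Line-by-line matching like this is brittle (wrapped descriptions and reordered labels will break it), which is part of why reading the schema from an FGDB would likely be more reliable.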

@damonmcc 👆 in reference to your question about how to export md.