mholt / PapaParse

Fast and powerful CSV (delimited text) parser that gracefully handles large files and malformed input
http://PapaParse.com
MIT License

Do not discard embedded metadata stored in comments as per W3C recommendation #1038

Open pbattino opened 8 months ago

pbattino commented 8 months ago

The W3C recommendation for supporting embedded metadata is to use comment lines, such as the 3rd and 4th lines in the example below:

# publisher City of Palo Alto
# updated 12/31/2010
#name GID on_street species trim_cycle  inventory_date
#datatype string  string  string  string  date:M/D/YYYY
  GID On Street Species Trim Cycle  Inventory Date
  1 ADDISON AV  Celtis australis  Large Tree Routine Prune  10/18/2010
  2 EMERSON ST  Liquidambar styraciflua Large Tree Routine Prune  6/2/2010

I understand that parsing and reusing the metadata in these comment lines may be seen as out of scope for PapaParse. But if the parser could at least keep these lines instead of discarding them (when the option 'comments: "#"' is used) and simply pass them to the meta object, a developer could process them further if needed.

See https://www.w3.org/TR/tabular-data-model/#embedded-metadata
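
In the meantime, one possible workaround is to collect the comment lines from the raw text before handing it to the parser. The sketch below is only an illustration and is not part of PapaParse: `extractCommentLines` is a hypothetical helper, and the tab-delimited sample is an assumption about how the original file is laid out. It splits on line breaks naively, so a quoted field that spans several lines and happens to start a line with "#" would be misread.

```javascript
// Hypothetical helper (not a PapaParse API): collects every line that starts
// with the comment prefix so the embedded metadata is not lost.
function extractCommentLines(text, commentPrefix = "#") {
  const comments = [];
  // Naive split on line breaks; does not account for quoted multi-line fields.
  const lines = text.split(/\r\n|\n|\r/);
  lines.forEach((line, index) => {
    if (line.startsWith(commentPrefix)) {
      comments.push({
        line: index + 1, // 1-based line number in the input
        text: line.slice(commentPrefix.length).trim(),
      });
    }
  });
  return comments;
}

// Sample input, assumed to be tab-delimited.
const sample = [
  "# publisher City of Palo Alto",
  "# updated 12/31/2010",
  "#name GID on_street species trim_cycle inventory_date",
  "#datatype string string string string date:M/D/YYYY",
  "GID\tOn Street\tSpecies\tTrim Cycle\tInventory Date",
  "1\tADDISON AV\tCeltis australis\tLarge Tree Routine Prune\t10/18/2010",
].join("\n");

const embeddedComments = extractCommentLines(sample);
console.log(embeddedComments);
```

The collected lines could then be kept next to the result of `Papa.parse(sample, { comments: "#" })`, which still skips them during parsing.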