tdwg / dwc-qa

Public question and answer site for discussions about Darwin Core
Apache License 2.0

Pipe separators are interfering with Markdown-based feedback loops #209

Open mjy opened 4 months ago

mjy commented 4 months ago

An observation.

We're starting to work with aggregated reports on data submitted to GBIF.

If we want to clean up the formatting of these reports so that feedback round-trips better, Markdown might be useful as an intermediate format for exchanging issues. However, when we include data values in those reports and those values contain pipes, we get rendering issues. Obviously we can escape pipes, but this requires another layer of handling.
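For illustration, the extra escaping layer could look something like the sketch below. It assumes the reports use GitHub-flavored Markdown tables, where `\|` is read as a literal pipe inside a cell. The names `escape_cell` and `markdown_row` are hypothetical helpers, not part of any existing tool.

```python
def escape_cell(value: str) -> str:
    """Escape a data value so it can sit inside one Markdown table cell.

    Assumption: the target renderer follows GitHub-flavored Markdown,
    where a backslash-escaped pipe does not end the cell.
    """
    # Double existing backslashes first so they cannot combine with our escapes.
    escaped = value.replace("\\", "\\\\").replace("|", "\\|")
    # A table cell cannot span lines, so fold any line breaks into spaces.
    return " ".join(escaped.splitlines())


def markdown_row(cells: list[str]) -> str:
    """Build one Markdown table row from raw (unescaped) values."""
    return "| " + " | ".join(escape_cell(c) for c in cells) + " |"


# A pipe-delimited multi-value field ends up in a single cell.
print(markdown_row(["recordedBy", "A. Smith | B. Jones"]))
```

Anyone reading the report back in would have to reverse the same escaping, which is the additional handling mentioned above.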

I'm wondering 2 things: 1) Should we move away from suggesting pipes as delimiters? 2) Why doesn't TDWG simply require a specific (non-pipe) delimiter when defining multiple values per term? Surely this character-based standard would greatly increase data interoperability.

ben-norton commented 4 months ago

@mjy I cross-posted this issue in the TAG repo for the next meeting. https://github.com/tdwg/tag/issues/47

cboelling commented 4 months ago

> 2. Why doesn't TDWG simply require a specific (non-pipe) delimiter when defining multiple values per term? Surely this character-based standard would _greatly_ increase data interoperability.

Struggling with pipe characters too. (2) would be my preferred solution.

ben-norton commented 4 months ago

@mjy @tucotuco @timrobertson100 Tim or John please correct me if I'm wrong. It is my understanding that Option 2 was the original directive. Many delimiters can be exceedingly problematic, commas especially. If you break down all of the possible common delimiters, pipes are arguably the least commonly used characters in string values. Hence, the current suggestion.

tucotuco commented 4 months ago

> pipes are arguably the least commonly used characters in string values. Hence, the current suggestion.

That is exactly right. A change in that recommendation would have immense repercussions that I would be loath to face without a proven better alternative.

MattBlissett commented 4 months ago

I think Markdown is an inappropriate format for sharing data, so I suggest escaping the characters or using HTML (<td>value | value</td>) which is also valid Markdown — though you'll then need to escape < and &.
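A minimal sketch of the HTML route, assuming the report is generated from Python: the standard library's `html.escape` handles `<`, `>` and `&`, and a pipe inside `<td>` needs no escaping. The helper name `html_cell` is hypothetical.

```python
import html


def html_cell(value: str) -> str:
    """Wrap a raw data value in a <td>, escaping &, < and >.

    quote=False leaves quotation marks alone, since the value is
    element content rather than an attribute value.
    """
    return "<td>" + html.escape(value, quote=False) + "</td>"


# The pipe passes through unchanged; only HTML-significant characters are escaped.
print(html_cell("Quercus alba | Q. <rubra> & sp."))
```

Reversing it on the receiving side is a single call to `html.unescape` on the cell content.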