ryu1kn / csv-writer

Convert objects/arrays into a CSV string or write them into a CSV file
https://www.npmjs.com/package/csv-writer
MIT License

Allow other delimiters #16

Closed actuallymentor closed 5 years ago

actuallymentor commented 5 years ago

Currently only , and ; are allowed. It would be wonderful to just open this field up to be anything (including \t, for example).

ryu1kn commented 5 years ago

Hi @actuallymentor, sorry for my slow response.

I actually prefer limiting the delimiter only to , as this is a Comma Separated Value writer. But as you know, ; is also supported because it is commonly used as a delimiter in some regions (👉 https://github.com/ryu1kn/csv-writer/pull/8#issuecomment-414106442), and without this, the library would be quite useless for some people.

I'm happy to support \t (👉 #10) as it is a very common format, even though I haven't found a reliable definition I can adhere to.

Do you have any specific use-case that you want to use any other characters as a delimiter? Happy to understand the background 🙂

actuallymentor commented 5 years ago

Thanks for the reply @ryu1kn

I actually prefer limiting the delimiter only to , as this is a Comma Separated Value writer.

While the C in CSV does indeed stand for comma, in many use cases CSVs tend to be more like Character Separated Values.

Think, for example, about SQL, where when importing delimited data one sets the DELIMITED BY operator to any delimiter of choice.

And without this, the library would be quite useless for some people

I have not dived into the codebase for this package, but from a usefulness perspective allowing any delimiter increases the use-case space drastically without a large code change.

ryu1kn commented 5 years ago

Yes, it would be a minor code change. Still, I'm happy to keep the library as is until I see a real scenario where we want to use other characters. If you're facing a problem because of this limitation, let me know; that would be the perfect time to extend it.

actuallymentor commented 5 years ago

The aforementioned SQL import was giving me headaches. One of the fields contained HTML, which itself contained commas and semicolons. I wanted to use an obscure character as the delimiter.

I ended up discarding this package and encoding the HTML as base64 for the import process.

ryu1kn commented 5 years ago

Thanks for sharing the context. I can see where you're coming from.

Would allowing the arbitrary delimiter char suffice, or does the character escape rule also need to be changed? In RFC4180, double quote characters are used to wrap a field or to escape a double quote in a field (👉 this)

actuallymentor commented 5 years ago

Would allowing the arbitrary delimiter char suffice

From a user perspective, that is the intended behavior, yes.

does the character escape rule also need to be changed? In RFC4180 double quote characters are used to wrap a field or to escape a double quote in a field

Interesting. Some questions to figure that out:

  1. Many CSVs don't quote fields in " characters, would that be against the RFC?
  2. If the delimiter is 👻 that would imply the escaping of 👻 in the data columns wouldn't it?
  3. Perhaps this discussion should be considered outside of the official RFC spec's scope and the module could support a { rfc: false } option that allows for non-standard options

Note: I realise that this discussion is venturing into non-standard territory. It would be 100% understandable if you don't want to implement/maintain such functionality. I am still of the opinion, though, that it would help a number of use cases.

ryu1kn commented 5 years ago

Apologies for my late response! Thank you for understanding my cautious attitude towards introducing non-standard behaviour. And yes, this is because, although I haven't tested this assumption, I think people find this library more reliable if it tries to align with the standard (as I do).

  1. Many CSVs don't quote fields in " characters, would that be against the RFC?

That's covered by items 5-7 in section 2: it's fine not to quote fields with " characters, but you need to quote a field with " if it contains ", a comma, or a newline. But it also says, "This section documents the format that seems to be followed by most implementations".
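To make the quoting rule concrete, here is a minimal sketch of a field writer that follows it. The names `quoteField` and `formatRecord` are hypothetical helpers written for this illustration; they are not csv-writer's API, and the optional `delimiter` parameter is an assumption added only to show how the same rule would extend to another separator.

```typescript
// Hypothetical helpers for illustration only; not part of csv-writer.

// Quote a field only when it contains a double quote, the delimiter,
// or a line break. Inner double quotes are escaped by doubling them.
function quoteField(value: string, delimiter: string = ","): string {
  const specials = ['"', delimiter, "\n", "\r"];
  const needsQuoting = specials.some((s) => value.indexOf(s) !== -1);
  return needsQuoting ? '"' + value.replace(/"/g, '""') + '"' : value;
}

// Join already-quoted fields into one record (without the line ending).
function formatRecord(fields: string[], delimiter: string = ","): string {
  return fields.map((f) => quoteField(f, delimiter)).join(delimiter);
}

// formatRecord(["apple", "orange", "foo,bar"]) returns: apple,orange,"foo,bar"
// formatRecord(['say "hi"', "x"])              returns: "say ""hi""",x
```

Plain fields pass through untouched, which is why many CSV files contain no quote characters at all and still match the format described in the RFC.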

  2. If the delimiter is 👻 that would imply the escaping of 👻 in the data columns wouldn't it?

If we just change a delimiter, we would just need to quote 👻, like apple👻orange👻"foo👻bar". But I don't know if the SQL import statement works fine with the double-quote escaping rule, i.e. can it parse apple👻orange👻"foo""bar" as apple, orange and foo"bar.
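For the reading side, the sketch below shows what a consumer would have to do for the double-quote escaping rule to survive a delimiter change. `parseRecord` is a hypothetical function written for this illustration; it says nothing about how any particular SQL import statement behaves, which remains the open question. It handles a single record and does not attempt embedded line breaks across lines.

```typescript
// Hypothetical single-record parser for illustration only.
// Assumes RFC 4180 style quoting with a caller-chosen delimiter.
function parseRecord(line: string, delimiter: string): string[] {
  if (delimiter === "") throw new Error("delimiter must not be empty");
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  let i = 0;
  while (i < line.length) {
    if (inQuotes) {
      if (line[i] === '"' && line[i + 1] === '"') {
        current += '"'; // doubled quote inside a quoted field is a literal quote
        i += 2;
      } else if (line[i] === '"') {
        inQuotes = false; // closing quote
        i += 1;
      } else {
        current += line[i]; // delimiters inside quotes are plain data
        i += 1;
      }
    } else if (line[i] === '"' && current === "") {
      inQuotes = true; // opening quote at the start of a field
      i += 1;
    } else if (line.slice(i, i + delimiter.length) === delimiter) {
      fields.push(current); // delimiter may span several UTF-16 code units
      current = "";
      i += delimiter.length;
    } else {
      current += line[i];
      i += 1;
    }
  }
  fields.push(current);
  return fields;
}

// parseRecord('apple👻orange👻"foo""bar"', "👻") returns ["apple", "orange", 'foo"bar']
// parseRecord('apple👻orange👻"foo👻bar"', "👻") returns ["apple", "orange", "foo👻bar"]
```

The delimiter is compared as a substring rather than a single character because 👻 occupies two UTF-16 code units in a JavaScript string.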

  3. Perhaps this discussion should be considered outside of the official RFC spec's scope and the module could support a { rfc: false } option that allows for non-standard options

Yes, if we introduce these changes, we want to make it clear ;)

ryu1kn commented 5 years ago

I'm going to close this for now, but I'll reconsider if this behaviour is what many other people require.

Thank you again for raising this suggestion. I appreciate your input!

EsrefDurna commented 5 years ago

Limiting the delimiter is a total lost opportunity. Many companies and people have to sync text data, for example user notes and user inputs. Encoding and decoding across different systems is not the right approach; you cannot expect a CSV importer to be able to encode and decode special characters. I forked this repository and will soon publish it as a new npm module that doesn't dictate to users what kind of separator can or cannot be used. Don't be a dictator, that's not good.

Thank you