d3 / d3-dsv

A parser and formatter for delimiter-separated values, such as CSV and TSV.
https://d3js.org/d3-dsv
ISC License

Could we possibly add a ssv() or wssv()? #70

Closed. KennethKinLum closed this issue 4 years ago

KennethKinLum commented 4 years ago

Besides fetching data with tsv() and csv(), could we support ssv() or wssv() for whitespace-separated values?

For example, if the data is

x y
1 3
2.2 3.4
-1 -2

the data is well formatted and readable by a human, and it can reasonably be treated as proper data and processed by a program. For files such as

x  y 
1     3
  2.2     -1.8
 3  4

the data is not nicely formatted, but if a human can understand it, could it also be accepted and processed by a program?

The following can preprocess the second form:

const data = `x  y 
1     3
  2.2     -1.8
 3  4`;

console.log(data.split("\n").map(line => line.trim().split(/\s+/).join("\t")).join("\n"));
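
The one-liner above can be wrapped in a small reusable function. This is only a minimal sketch: the name toTsv is hypothetical and not part of d3-dsv, and it assumes lines are separated by "\n" and that no field contains whitespace.

```javascript
// Hypothetical helper (not part of d3-dsv): trim each line and collapse
// every run of whitespace inside it into a single tab, producing TSV text.
function toTsv(text) {
  return text
    .split("\n")
    .map(line => line.trim().split(/\s+/).join("\t"))
    .join("\n");
}

// The loosely formatted sample from this comment, written with explicit
// escapes so the irregular spacing is visible.
const sample = "x  y \n1     3\n  2.2     -1.8\n 3  4";

console.log(toTsv(sample));
```

The resulting string should then be usable with a regular TSV parser such as d3.tsvParse.
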
mbostock commented 4 years ago

You can make your own using d3.dsvFormat(" ").

https://observablehq.com/d/5234335063a77f4a

KennethKinLum commented 4 years ago

Thank you. Can it do the fetch as well? Would it be:

d3.text("data.txt")
  .then(txt => ssv.parse(txt))
  .then(data => {

  });

?

That doesn't seem to work. It seems it needs to be:

    const ssv = d3.dsvFormat("\t ");
    d3.text("multiple.tsv")
        .then(txt => {
            const data = ssv.parse(txt);
            console.log(data);
        });

but it seems d3.dsvFormat("\t ") needs to be d3.dsvFormat(" "). If a line contains

    3   \t 4

(some spaces and a tab), is it not possible to make it work this way? However, if the whitespace is normalized first:

    const ssv = d3.dsvFormat(" ");
    d3.text("multiple.tsv")
        .then(txt => {
            // Collapse every run of tabs and spaces into a single space.
            txt = txt.replace(/[\t ]+/g, " ");
            const data = ssv.parse(txt);
            console.log(data);
        });

then it can work.

mbostock commented 4 years ago

d3.dsvFormat only allows a single character as the delimiter; it doesn’t support multi-character delimiters or regular expressions. You can use space as a delimiter, or tab, but not both.
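
To illustrate why a single-character delimiter struggles with irregular spacing, here is a plain-JavaScript analogy using String.prototype.split. It is not how d3-dsv is implemented; it only shows that when every space counts as a field boundary, a run of spaces produces empty fields, whereas splitting on a whitespace run does not.

```javascript
// Analogy only (not d3-dsv's implementation): compare splitting on a single
// space with splitting on a run of whitespace.
const line = "1" + " ".repeat(5) + "3"; // "1", five spaces, "3"

const bySingleSpace = line.split(" ");     // every space is a boundary
const byWhitespaceRun = line.split(/\s+/); // a whole run is one boundary

console.log(bySingleSpace.length); // 6 (four of the fields are empty strings)
console.log(byWhitespaceRun);      // [ '1', '3' ]
```

This is why collapsing whitespace before parsing is needed when the input is not separated by exactly one space.
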

To fetch and parse a file:

d3.text(url).then(text => ssv.parse(text))

Added examples here:

https://observablehq.com/d/5234335063a77f4a

KennethKinLum commented 4 years ago

I appreciate it.

I was trying to replicate the case with spaces in front of the first value and this seems to work ok:

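    const ssv = d3.dsvFormat(" ");
    d3.text("multiple.ssv")
        .then(txt => {
            // Collapse whitespace runs, then strip the leading and trailing space on each line.
            txt = txt.replace(/[\t ]+/g, " ").replace(/^[ ]/gm, "").replace(/[ ]$/gm, "");
            const data = ssv.parse(txt);
            console.log(data);
        });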
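
The cleanup step can also be separated from the fetch so it can be checked without a network request. A minimal sketch, assuming a hypothetical helper name normalizeWhitespace (not part of d3) and input where fields never contain spaces or tabs:

```javascript
// Hypothetical helper (not part of d3): make loosely spaced text safe for a
// single-space delimiter by applying the same three replacements as above.
function normalizeWhitespace(text) {
  return text
    .replace(/[\t ]+/g, " ") // collapse runs of tabs/spaces into one space
    .replace(/^[ ]/gm, "")   // drop the leading space on each line
    .replace(/[ ]$/gm, "");  // drop the trailing space on each line
}

const messy = "x  y \n1     3\n  2.2 \t -1.8\n 3  4";

console.log(normalizeWhitespace(messy));
```

The cleaned string could then be passed to ssv.parse inside the d3.text(...).then(...) callback in place of the inline replace chain.
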