Financial-Times / dj-sheet-reader

Reads Google Spreadsheets and transforms them into a simple data structure. Used inside Bertha.
MIT License
4 stars 1 forks source link

Columns to array conversion adding an extraneous null value #54

Open joannaskao opened 2 years ago

joannaskao commented 2 years ago

Since moving to the new version of the Google Sheet API and using https://github.com/Financial-Times/dj-sheet-reader/, turning columns into arrays seems to be adding a null value as the first value in the array.

Example Google Sheet (being used in production, so please make a new version before fiddling around with the content): https://docs.google.com/spreadsheets/d/1zUMzHn5SgZlAAfQJldvWKbaI5TSPqnzOig4YZqq6wo8/edit#gid=0 Bertha link: https://bertha.ig.ft.com/view/publish/gss/1zUMzHn5SgZlAAfQJldvWKbaI5TSPqnzOig4YZqq6wo8/data

kavanagh commented 2 years ago

Ok, so just for the record (in case someone doesn't have access to the sheet or it is changed later), here's an example of that use case.

Spreadsheet input:

column.1 column.2 column.3
a b c

JSON output:

[
  {
    "column": [
      null,
      "a",
      "b",
      "c",
    ] 
  }
]

I've never seen an array created in this way before. It's certainly not something I've ever tested, supported or documented. I'm actually surprised it worked before, especially in the way you say it did. I guess that this worked the way it did before by accident and probably a quirk of the underlying google API or even Node/V8 version change.

I would expect the current behaviour to be correct because the index starts at 1, rather than 0.

I'll have a look into why it was like this before and that might suggest what to do about it. I'd prefer not to support this case because it seems wrong to use an array index starting at 1.

kavanagh commented 2 years ago

Actually, thinking about this, the original behaviour should never have been to output an array but an object instead. This is what I would expect to have happened before:

Spreadsheet input:

column.1 column.2 column.3
a b c

JSON output:

[
  {
    "column": {
      "1": "a",
      "2": "b",
      "3": "c",
    }
  }
]

@joannaskao Do you have an example JSON that hasn't been republished since the upgrade (and therefore still has JSON created by the old version)?

Also, the the fact this worked before probably has something to do with the application code that consumed this data. if I could take a look at the lines of code that read in the choices array, that would help me understand it more.

I'm not sure why it's outputting an array now, I'll look into that too.

joannaskao commented 2 years ago

@kavanagh Here's an Bertha link to one that was republished before: https://bertha.ig.ft.com/view/publish/gss/1PVNh0fGXM3Su8J4tKAio6b9m24i2xdPHskS_moAmnXo/data (spreadsheet is https://docs.google.com/spreadsheets/u/1/d/1PVNh0fGXM3Su8J4tKAio6b9m24i2xdPHskS_moAmnXo/edit#gid=1248452580) (again both are in production, though I don't expect anyone to ever republish these again)