mholt / PapaParse

Fast and powerful CSV (delimited text) parser that gracefully handles large files and malformed input
http://PapaParse.com
MIT License

PapaParsing a Google Sheet yields CORS error #809

Open jakob1111 opened 4 years ago

jakob1111 commented 4 years ago

When I try to use PapaParse to fetch data from a Google Sheet I shared as a csv file using

function init() {
  Papa.parse(public_spreadsheet_url, {
    download: true,
    header: true,
    complete: showInfo
  })
}

I get the error

Access to XMLHttpRequest at 'https://doc-0s-4k-sheets.googleusercontent.com/pub/l5l039s6ni5uumqbsj9o11lmdc/jbagvjajh1t9gstae10ndfkjtg/1593143255000/112192102762685134803/*/e@2PACX-1vQB-VAHmJgZQ00hlOGySWx8kd0Cq4z7o1V47juQc3PcTHkCuCNNmd9YxHZW4cnzDjA71UH0eL85VE5i?gid=0&single=true&output=csv' (redirected from 'https://docs.google.com/spreadsheets/d/e/2PACX-1vQB-VAHmJgZQ00hlOGySWx8kd0Cq4z7o1V47juQc3PcTHkCuCNNmd9YxHZW4cnzDjA71UH0eL85VE5i/pub?gid=0&single=true&output=csv') from origin 'https://learning-web.github.io' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.

meirroth commented 4 years ago

Having the same issue! It worked for me last week and now I'm getting exactly the same error! Driving me crazy x( Have you found a solution?

leucotic commented 4 years ago

I am having the same issue here too!!! I was able to get around it by using https://github.com/Rob--W/cors-anywhere/

Basically, you prefix any fetch URL with 'https://cors-anywhere.herokuapp.com/' and it fixes the issue. But ALL my projects are set up this way, so I have to fix ALL OF THEM!!!! and I'm very upset lmao
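
A minimal sketch of the prefixing approach described in this comment. The helper name `withCorsProxy` is hypothetical, and the default proxy is the public demo instance quoted above, which may be rate-limited or unavailable; pointing `proxyBase` at a self-hosted cors-anywhere instance is probably the safer assumption.

```javascript
// Hypothetical helper: prepend a CORS proxy base URL to a target URL.
// The default is the public demo instance quoted in this thread.
function withCorsProxy(targetUrl, proxyBase = 'https://cors-anywhere.herokuapp.com/') {
  // Make sure exactly one slash separates the proxy base from the target URL.
  const base = proxyBase.endsWith('/') ? proxyBase : proxyBase + '/';
  return base + targetUrl;
}

// Usage with PapaParse (assumes Papa, public_spreadsheet_url and showInfo exist on the page):
// Papa.parse(withCorsProxy(public_spreadsheet_url), { download: true, header: true, complete: showInfo });
```

Keeping the proxy base in one helper means that switching proxies later is a one-line change rather than an edit in every project.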

meirroth commented 4 years ago

I tested it last week for a client and it worked perfectly, and now it doesn't... very frustrating! It's a WordPress plugin that is loading the spreadsheet for me, and I don't know how to make it use this solution: https://cors-anywhere.herokuapp.com/. I'm ditching the whole Google spreadsheet thing; the client will need to upload the CSV manually to the website.

captainharrie commented 4 years ago

Also having the same issue. However, after doing some testing, I've found it is not limited to Google Sheets. The issue seems to be that PapaParse can no longer download a CSV file from a remote host. I have this page on my Tumblr blog where I am using PapaParse, https://diosmaden.tumblr.com/characters-old (/characters is currently redirecting to the neocities mirror), which I have also mirrored on my neocities, https://diosmaden.neocities.org/characters.html

The CSV file is currently hosted on neocities as I thought it was an issue with google sheets. You'll notice that despite the two pages having identical code, only the neocities page is working - the only difference here is that the csv file is local to neocities... I also tried uploading the csv file to my google drive and my one drive, but neither of those links worked - only the file on the same domain.

I am also using papaparse on my comicfury website - I found the same thing, as soon as I uploaded the csv file to my website files as opposed to linking to an external host, papaparse began working again.
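
The pattern reported here (same-domain files load, remote ones fail) matches how browsers treat cross-origin requests: a same-origin download needs no CORS headers, while a cross-origin one only succeeds if the remote host sends `Access-Control-Allow-Origin`. A small sketch for checking which case a CSV URL falls into; the helper name `isSameOrigin` is hypothetical.

```javascript
// Hypothetical helper: true when csvUrl resolves to the same origin as the page.
function isSameOrigin(csvUrl, pageUrl) {
  // Relative URLs such as "characters.csv" resolve against the page URL.
  const resolved = new URL(csvUrl, pageUrl);
  return resolved.origin === new URL(pageUrl).origin;
}

// In a browser, pageUrl would typically be window.location.href.
```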

iobaldia commented 4 years ago

Same issue here.

Somehow the header Access-Control-Allow-Origin is not present, and including it in the "downloadRequestHeaders" property doesn't seem to work either.

I also tried downgrading to version 5.0 and it behaves the same. Any idea?

akshaybabloo commented 4 years ago

I think you might need to have an Origin header for CORS to work

nathanvogel commented 4 years ago

It seems like this is the same issue as #353 but specifically with Google Sheets now. The redirect triggers a second request, and that second request does not have the 'Origin' header.

I think this is what happens:

  1. Client makes XHR request R1 to server S1 with the Origin header automatically set by the browser (the client cannot modify it).
  2. Server S1 answers with a 307 redirect to a different domain.
  3. XHR automatically follows the redirect and makes a second request R2 to server S2. This cannot be prevented (source). However, the browser sets the R2 'Origin' header to 'null' to protect the privacy of the client and prevent leaking the origin to a potentially undesired server, since S1 and S2 are on two different domains. (source) The 'null' Origin header cannot be overridden.
  4. Additionally, making the second request manually seems impossible, because trying to get the redirect URL with XMLHttpRequest.responseURL or XMLHttpRequest.getResponseHeader('Location') fails (null or empty string, source) since the request R1 terminates with the CORS error.

I'm not too sure about number 4, maybe someone knows a workaround to get the URL? However, even then it doesn't seem like something that can/should be fixed at the PapaParse library level.
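
Step 3 can be sketched as a small model. This is a simplification of the redirect rule in the Fetch standard as I understand it, not browser code: the Origin header is serialized as 'null' once a request that was already cross-origin gets redirected to yet another origin. The function name `originHeaderAfterRedirect` is hypothetical, and `pageOrigin` is assumed to be a serialized origin without a trailing slash.

```javascript
// Simplified model (assumption, not browser code) of the Origin header sent on R2.
function originHeaderAfterRedirect(pageOrigin, requestUrl, redirectUrl) {
  const current = new URL(requestUrl).origin; // origin of S1
  const next = new URL(redirectUrl).origin;   // origin of S2
  // "Tainted" when the redirect changes origin AND R1 was already cross-origin.
  const tainted = current !== next && pageOrigin !== current;
  return tainted ? 'null' : pageOrigin;
}

// Shape of the case in this issue: a github.io page requests docs.google.com,
// which redirects to googleusercontent.com, so R2 carries "Origin: null".
```

Under this model the header stays intact only when the first hop is same-origin, which is consistent with the reports above that hosting the CSV on the page's own domain works.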

FilipDominec commented 3 years ago

I have been fixing a similar problem, so I accidentally found this discussion. First I used 'https://cors-anywhere.herokuapp.com/' as a quick fix, but this stopped working recently.

Apparently a reliable solution was to define a callback and pass its name in the URL, like this:

window.googleDocCallback = function () { return true; };
var url = 'https://docs.google.com/spreadsheets/d/e/' + googleid + '/pub?output=csv&range=A1:ZZ9999&callback=googleDocCallback';
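
A hedged sketch of how this workaround might be wired into PapaParse. The helper names `buildSheetCsvUrl` and `loadSheet` are hypothetical; only the URL shape and the no-op callback come from the snippet above, and whether the `callback` parameter keeps working depends on Google's servers, which is not verified here. `papa` is passed in as an argument and is assumed to be the PapaParse object (`Papa` in a browser).

```javascript
// Hypothetical helper: build the published-CSV URL in the form quoted above.
function buildSheetCsvUrl(googleId, callbackName) {
  return 'https://docs.google.com/spreadsheets/d/e/' + encodeURIComponent(googleId) +
    '/pub?output=csv&range=A1:ZZ9999&callback=' + encodeURIComponent(callbackName);
}

// Hypothetical helper: register the no-op callback, then let PapaParse download the CSV.
function loadSheet(papa, googleId, onComplete) {
  // The callback only needs to exist under the name passed in the URL.
  globalThis.googleDocCallback = function () { return true; };
  papa.parse(buildSheetCsvUrl(googleId, 'googleDocCallback'), {
    download: true,
    header: true,
    complete: onComplete
  });
}

// Usage in a browser (showInfo as in the original report):
// loadSheet(Papa, googleid, showInfo);
```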