RomainVialard / Google-Plus-Community-Migrator

https://docs.google.com/document/d/1UGhxaN5AiRXXL0Ki0DlVWLYJo_YYiEYhM2w1caRhljU/edit

Refactor #16

Open brainysmurf opened 5 years ago

brainysmurf commented 5 years ago

This is basically a rewrite of Code.gs; let me know if you want to continue on this path or not. I used some different design patterns that I think make for cleaner code, and it works pretty well. But as with any big change, there might be some hesitation.

What it does:

Question:

RomainVialard commented 5 years ago

@brainysmurf actually I was able to get the G+ API quota raised on a specific cloud console project. So instead of advising people to switch from one project to another in order to get more quota, best might be to create a real Apps Script web app and advise everyone to use it instead of making copies of the script project.

It means that we would need a UI / form to let people enter mandatory parameters (their Firebase URL and search query in G+) and then we could use your method of using UrlFetchAll to do the export as fast as possible, without having to worry about the daily quota anymore.
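The parallel-export approach mentioned here might look something like the sketch below. The URL and `API_KEY` are illustrative placeholders, not the project's actual code; only the `chunk` helper is plain JavaScript, while the `fetchAll` part is Apps Script-only and shown as a comment.

```javascript
// Split an array into chunks so each fetchAll() call stays a manageable size.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// In Apps Script (not runnable in plain Node), each chunk would then be
// fetched in parallel with a single UrlFetchApp.fetchAll() call:
//
// const requests = activityIds.map(id => ({
//   url: 'https://www.googleapis.com/plus/v1/activities/' + id +
//        '/comments?key=' + API_KEY,  // hypothetical endpoint shape
//   muteHttpExceptions: true
// }));
// for (const batch of chunk(requests, 100)) {
//   const responses = UrlFetchApp.fetchAll(batch); // parallel HTTP calls
//   // ...parse responses, write results to Firebase...
// }
```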

brainysmurf commented 5 years ago

Oh that's a nice solution.

brainysmurf commented 5 years ago

Is this not getting merged? Did you ever get confirmation about your theory of the quota per 200 milliseconds?

RomainVialard commented 5 years ago

@brainysmurf Sorry, as this is heavily relying on UrlFetch quota, I'm still wondering if it's best to implement it or not. For people with a lot of data to retrieve, it might eat up all their UrlFetch quota for the day (which would also impact other scripts they might be using). I've deployed a new version of the script as a web app (shared instance) with enough G+ API quota to retrieve everything. I'm investigating whether it could make sense to test another way to parallelize calls to retrieve comments & plusoners while still relying on the G+ advanced service instead of UrlFetch...

brainysmurf commented 5 years ago

I'm not sure how the UrlFetch quota interacts with scripts in different projects; one day during testing I ran the full script at least 5 times, each under a different project (by making copies). Anyway, now that I'm back in the swing of things I don't have a great amount of time to contribute further. I'd appreciate acknowledgment of the (theoretical?) work, however. :)

brainysmurf commented 5 years ago

Or do you mean because of the user quota in effect for the single project you're setting up as a full web app? Wouldn't increasing the number of request items passed to UrlFetchApp.fetchAll help in that regard, since one fetchAll call only counts as 1 against the UrlFetch quota? I tried with the inner loop at 100 (so 100 * 20 * 2 = 4,000 requests) and didn't seem to hit the per-200-milliseconds limit either.

RomainVialard commented 5 years ago

Sadly no, each fetch in the fetchAll() counts toward the quota. A gmail.com account can make 20K calls per day. So, if you want to retrieve 10K posts (eg: the content of the Apps Script community), you will need to make 20K calls to get comments and plusoners, which will eat up your quota for the day and prevent other scripts from working well on your account.
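The quota arithmetic in this comment, spelled out (assuming one comments call and one plusoners call per post, as described above):

```javascript
// Rough quota math: each post needs two follow-up calls (comments +
// plusoners), counted against a 20,000 calls/day consumer UrlFetch quota.
const DAILY_URLFETCH_QUOTA = 20000; // gmail.com account
const posts = 10000;                // e.g. the Apps Script community
const callsPerPost = 2;             // comments + plusoners
const callsNeeded = posts * callsPerPost;             // 20,000
const quotaLeft = DAILY_URLFETCH_QUOTA - callsNeeded; // 0 left for other scripts
```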

Parallelizing calls to the API using fetchAll() is still a great idea, I'm just not sure it's the right one in our current use case.

Would you be interested in writing a blog post with me about the different ways to parallelize calls in Apps Script? We could take the example of this export of G+ data, or maybe switch to a more popular API, like Drive or Gmail?

brainysmurf commented 5 years ago

It’s a fascinating topic, yes, happy to collaborate on that.

Maybe a Python or Node backend is best: no external request limit, and async/await is awesome.
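For the Node option mentioned here, the same fan-out could be sketched with async/await roughly as follows. This is a hypothetical helper, not the project's code; it assumes Node 18+ (global `fetch`), and `fetchFn` is injectable so the batching logic can be exercised without a network.

```javascript
// Fan out HTTP calls with async/await, a fixed number per parallel batch.
// fetchFn defaults to the global fetch but can be stubbed for testing.
async function fetchJsonBatch(urls, fetchFn = fetch, concurrency = 20) {
  const results = [];
  for (let i = 0; i < urls.length; i += concurrency) {
    const batch = urls.slice(i, i + concurrency);
    // Fire the whole batch in parallel and wait for all of them together.
    const responses = await Promise.all(batch.map(u => fetchFn(u)));
    results.push(...await Promise.all(responses.map(r => r.json())));
  }
  return results;
}
```

Unlike Apps Script's fetchAll, this runs with no per-account UrlFetch quota, only whatever rate limits the remote API itself imposes.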