j0be / PowerDeleteSuite

Power Delete Suite for Reddit

Claims to export all deleted comments but only exports a few #20

Open · DiogenesOfMiami opened this issue 3 years ago

DiogenesOfMiami commented 3 years ago

Every time I run PDS, it only exports a fraction of the comments it deletes and claims to export. When I open the .csv, the vast majority of the comments it claimed to export aren't there:

https://i.imgur.com/Q0wL8Ma.png

JoshBoehm commented 3 years ago

I had a similar issue (just exporting, not deleting): the script said a few thousand comments were exported, but the CSV stops after 95 records for me.

fwn commented 3 years ago

I used PowerDeleteSuite just for backup. Excellent tool! Anyway, regarding this bug: the export button's href actually does contain all the data it claims to. It's just that, apparently, once clicked, only the first 32 KB of the data URI actually get saved to the file. I tested this in the new MS Edge and in Firefox.

As a workaround, you can right-click the button and select "Copy link". If you then paste your clipboard into a text file, you get the full-sized export you wanted. This screws with the formatting, but that can be fixed by deleting the first 28 characters:

data:text/csv;charset=utf-8,

and running a urldecode over the rest of it, which I did locally with a small PHP script like this:

<?php
file_put_contents('workingoutput.csv', urldecode(file_get_contents('rawinput.txt')));
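
If you'd rather not delete the prefix by hand, here's a sketch along the same lines that strips it before decoding (the file names are just placeholders):

<?php
// Strip the data-URI prefix if it's present, then URL-decode the rest.
$raw = file_get_contents('rawinput.txt');
$prefix = 'data:text/csv;charset=utf-8,';
if (strpos($raw, $prefix) === 0) {
    $raw = substr($raw, strlen($prefix));
}
file_put_contents('workingoutput.csv', urldecode($raw));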

One small caveat: it gathered only the first year of comments (around a megabyte in my case), but that might be a Reddit API restriction; I'm not sure.

Okxa commented 1 year ago

I'll just add that on some browsers, even copying the link address might not copy all of it.

However, if you inspect the button element, the link element's href attribute seems to have all the data, so you can copy it from there with the browser devtools: double-click the href value to select it all and copy with Ctrl+C.

Also, that data is not URL-encoded.
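
So if you feed a devtools copy into a script like fwn's, it may be safer to only decode when the text still looks percent-encoded. A rough sketch (the percent-escape check is just a heuristic, and the file names are placeholders):

<?php
// Only URL-decode if the payload still contains percent-escapes.
$raw = file_get_contents('rawinput.txt');
if (preg_match('/%[0-9A-Fa-f]{2}/', $raw)) {
    $raw = urldecode($raw);
}
file_put_contents('workingoutput.csv', $raw);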

mbirth commented 1 year ago

Can you check whether it stops at a # for you? If so, it might be related to my issue #34.

RenaKunisaki commented 1 year ago

Just ran into the same problem. The last line is:

"","\

which definitely feels like a formatting issue.

If there are issues with huge data URIs, it seems like a much better method would be to add a big text field to the UI and append the data to it as it's being processed. That would avoid these problems, and would also let you copy partial results if the process gets interrupted.