j0be / PowerDeleteSuite

Power Delete Suite for Reddit

Export stops after hashtag #34

Closed mbirth closed 1 year ago

mbirth commented 1 year ago

I've "prepared for export" my whole reddit account and the results page says something about 2901 exported entries.

Yet when I download the items, I get only 23, the last of which is this one, truncated at the #. The whole line in the CSV looks like this:

"","You can also follow hashtags, e.g.

And that's the end of the downloaded file.

This is with Firefox 114.0 on macOS.

HinataNatsumi commented 1 year ago

I can confirm the backup CSV fails to “download” midway if the content has a hashtag # in it. It may or may not be related to #20.

I was able to use a workaround (using inspect element) mentioned in #20 to get the “correct” CSV downloaded. Comparing it to the corrupted CSV, the corrupted file does indeed seem to always cut off at a hashtag. Unfortunately, the workaround is still not 100 percent perfect, as the formatting gets broken for me.

HinataNatsumi commented 1 year ago

I think this happens because the export is delivered as a URL, and in a URL the hashtag # marks the start of the fragment, so everything from the first # onward is dropped from the CSV download (I wonder if this problem is related).

Either way, for now we can use the workaround to get the complete backup copy. Just make sure the backup copy link doesn’t disappear before you navigate away from the tab.
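To illustrate the explanation above, here is a minimal Python sketch of how an unescaped # can cut a CSV payload short when it is embedded in a URL. It assumes the export link is a `data:` URI with the prefix shown, which is a guess and not confirmed in this thread; the sample row (including `#example`) and the helper names `resource_part` and `decode_payload` are made up for illustration.

```python
from urllib.parse import quote, unquote

# Assumed prefix of the export link; the real one used by PDS may differ.
DATA_PREFIX = "data:text/csv;charset=utf-8,"


def resource_part(url: str) -> str:
    """Drop the fragment: in a URL, the first '#' starts the fragment."""
    return url.partition("#")[0]


def decode_payload(url: str) -> str:
    """Return the CSV text a consumer would see for this data URI."""
    return unquote(resource_part(url)[len(DATA_PREFIX):])


# Hypothetical two-row export; '#example' stands in for any hashtag.
csv_text = '"","You can also follow hashtags, e.g. #example"\n"","next row"\n'

# Naive link: the raw CSV is appended as-is, so '#' is read as a fragment.
naive_url = DATA_PREFIX + csv_text

# Escaped link: every reserved character, including '#', is percent-encoded.
encoded_url = DATA_PREFIX + quote(csv_text, safe="")

print(repr(decode_payload(naive_url)))    # stops right before the '#'
print(repr(decode_payload(encoded_url)))  # full CSV, both rows intact
```

In this sketch the naive link loses everything after `e.g. `, matching the truncated line in the original report, while the percent-encoded link round-trips the whole file.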

donquxiote commented 1 year ago

Had the same issue. There were hashtags in the Unicode for emoji in some posts and in a few other places. My quick workaround:

  1. Inspect the export button
  2. Copy the data to Sublime Text
  3. Save a backup of the link
  4. Find and replace # with -
  5. Paste the link into the browser
  6. Verify the contents. (I had a small posting history of ~400 items, and it was all there)

CubGeek commented 1 year ago

For what it's worth, I followed @donquxiote's suggestion to inspect the export button, and changed a step:

  1. Inspect 'Download Exported Items' button
  2. Right-click the element, select 'Edit as HTML' to open the attribute as an editable text box.
  3. Copy everything and paste into external text editor (I used Notepad++).
  4. Save a copy as backup, then open a new text file and paste again (so you don't work with the backup you just created).
  5. Do a global search and replace for the hashtag symbol (#) and replace it with its percent-encoded (URL-encoded) form (%23).
  6. Copy everything and paste it back into the text box in the browser, click outside the text box to "set" the attributes.
  7. Click the 'Download Exported Items' button to download the CSV file.
  8. Reality check: looks like everything is there (I've got 10 years of post/comment history to check. So far, about 4 years back everything seems to be there).

Thanks to everyone who helped!
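The search-and-replace step from the workarounds above can also be scripted. This is a rough Python sketch, assuming you have already copied the link value of the download button into a string; the function name `escape_hashes` and the sample link are hypothetical. Apply it to the link value only, since a # elsewhere in the copied HTML may be legitimate.

```python
from urllib.parse import unquote


def escape_hashes(href: str) -> str:
    """Replace every literal '#' with its percent-encoded form '%23'."""
    return href.replace("#", "%23")


# Hypothetical link value copied from the export button.
href = 'data:text/csv;charset=utf-8,"","tags: #one and #two"'

fixed = escape_hashes(href)
print(fixed)

# Decoding the fixed link gives back the original text, hashtags included.
print(unquote(fixed) == href)
```

Unlike replacing # with -, using %23 keeps the original characters in the downloaded file, because the browser decodes %23 back to # when it reads the link.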

j0be commented 1 year ago

Yeah, known defect with how PDS builds the csv file. See the open PR if you're curious.