hrenfroe / yahoo-groups-backup

A python script to backup the contents of private Yahoo! groups.

Files are only 16k? #12

Open amajot opened 4 years ago

amajot commented 4 years ago

When I run dump_site after scrape_messages and scrape_files, the files it generates under /data/files/ are all the same file size and are usually corrupt. For one site (Kenwood_TS50) all of the files are a uniform 16 KB, and for another (IC-735) they are all 112 KB. If the site has folders, the same thing applies to the files inside each folder: they're all a uniform size.

I suspect that it is creating the files with the correct file names and types, but only copying the content from the first file in a given folder.
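A quick way to confirm that suspicion is to hash everything under the output directory and count how many distinct contents there actually are. This is just a minimal diagnostic sketch, assuming the dump lives under data/files/ (adjust FILES_DIR to your layout):

```python
#!/usr/bin/env python3
"""Check whether the dumped files are byte-for-byte duplicates of each other."""
import hashlib
from collections import Counter
from pathlib import Path

FILES_DIR = Path("data/files")  # assumed dump_site output directory

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

hashes = Counter(sha256(p) for p in FILES_DIR.rglob("*") if p.is_file())
total = sum(hashes.values())
print(f"{total} files, {len(hashes)} unique contents")
for digest, count in hashes.most_common(5):
    print(f"{count:4d} files share hash {digest[:12]}")
```

If the number of unique contents is close to the number of folders rather than the number of files, that would line up with the "first file copied everywhere" theory.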

Has anyone else seen this? To get around it, I'm thinking I can use https://github.com/IgnoredAmbience/yahoo-group-archiver to pull just the files and then copy them over into my data directory. But if it's something easy I'm missing, I'd rather just use this tool if possible.

hrenfroe commented 4 years ago

If it's something easy, I don't know what it is, so try the other tool and see if it works out for you. I'll try to look into the file issue, but my Chromedriver seems to be on strike at the moment.

amajot commented 4 years ago

I did end up using the workaround with the other script and have no complaints, since it unblocks me. I copied in the files it pulled, overwrote the corrupt ones, and boom, back in business.
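For anyone else doing the same, here's a minimal sketch of that copy step. Both directory names are assumptions (archiver-output/files is a made-up path for the other tool's output); it just mirrors files from the good tree onto matching paths in the corrupt one:

```python
#!/usr/bin/env python3
"""Overwrite the corrupt dump_site files with the copies pulled by
yahoo-group-archiver. Adjust both paths to wherever the tools wrote output."""
import shutil
from pathlib import Path

GOOD_DIR = Path("archiver-output/files")  # hypothetical: other tool's output
CORRUPT_DIR = Path("data/files")          # hypothetical: this tool's output

for good in GOOD_DIR.rglob("*"):
    if not good.is_file():
        continue
    target = CORRUPT_DIR / good.relative_to(GOOD_DIR)
    if target.exists():
        shutil.copy2(good, target)  # replace the corrupt copy, keep timestamps
        print(f"replaced {target}")
```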