csaftoiu / yahoo-groups-backup

A python script to backup the contents of private Yahoo! groups.
The Unlicense

AssertionError: Stripped name didn't match author name #31

Closed: vielmetti closed this issue 8 years ago

vielmetti commented 8 years ago

I successfully pulled just short of 100 messages from a group, and then got this "assertion error". There's nothing obvious that I can see that would have triggered it.

Traceback (most recent call last):
  File "./yahoo-groups-backup.py", line 129, in <module>
    main()
  File "./yahoo-groups-backup.py", line 125, in main
    arguments, cfg_args)
  File "./yahoo-groups-backup.py", line 103, in invoke_subcommand
    return module.command(args)
  File "/Users/emv/src/yahoo-groups-backup/yahoo_groups_backup/subcommands/scrape_messages.py", line 50, in command
    msg = scraper.get_message(cur_message)
  File "/Users/emv/src/yahoo-groups-backup/yahoo_groups_backup/scraper.py", line 177, in get_message
    return self._massage_message(data)
  File "/Users/emv/src/yahoo-groups-backup/yahoo_groups_backup/scraper.py", line 127, in _massage_message
    stripped_name, data['authorName'], check_authorname,
AssertionError: Stripped name  didn't match author name klatta2@cox.net (check name was klatta2@cox.net)

Please let me know what else you need to help replicate or diagnose this issue.

csaftoiu commented 8 years ago

Hmm, interesting - can you share what group it was from? If so I can run it and see.

If not, can you change the error message to also output data['from']? That would let me see what the logic error was; it's probably a name format I wasn't expecting.

In the meantime you can comment out these lines of code and continue scraping. This is just a sanity check to make sure no info is being lost in the stripping process. The stripping itself may be working while only the sanity check fails, in which case commenting it out is safe. Or there may be a name in a format it's not expecting, in which case you may end up with an incomplete backup.
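
As a rough illustration of that workaround, the check could be downgraded from a hard assertion to a logged warning instead of being commented out. This is a minimal sketch, not the code in scraper.py: the helper name `check_stripped_name`, its `strict` flag, and the plain equality comparison are all assumptions made for the example.

```python
import logging

log = logging.getLogger("yahoo_groups_backup.sketch")


def check_stripped_name(stripped_name, check_authorname, raw_from, strict=False):
    """Hypothetical helper: report a name mismatch instead of aborting the scrape.

    Returns True when the two names agree and False otherwise. With
    strict=True a mismatch raises AssertionError, like a hard sanity check.
    """
    if stripped_name == check_authorname:
        return True
    message = "Stripped name %r didn't match check name %r (data from was %r)" % (
        stripped_name, check_authorname, raw_from)
    if strict:
        raise AssertionError(message)
    # Keep scraping, but leave a trace so suspect messages can be reviewed later.
    log.warning(message)
    return False
```

With `strict=False` the scrape would continue past a mismatch, and the logged warnings would show which messages might have lost author information.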

vielmetti commented 8 years ago

I added the error message as requested, which now reads:

AssertionError: Stripped name  didn't match author name klatta2@cox.net (check name was klatta2@cox.net) (data from was &lt;klatta2@cox.net&gt;)

The group is "vacuum-egroup". I can't recall just how private the archives are, but I think you can get to them. The message number is 2039, if I'm reading it right.
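
For reference, a from value of that shape carries an address but no display name, which would be consistent with the empty stripped name in the error. The sketch below shows the split using only the standard library (`html.unescape` and `email.utils.parseaddr`). The helper name `split_from` and the example address `jane@example.com` are hypothetical, and this is not necessarily how scraper.py parses the field.

```python
import html
from email.utils import parseaddr


def split_from(raw_from):
    """Hypothetical helper: split an HTML-escaped From value into (display_name, email)."""
    # Undo the HTML escaping first, then let the standard library parse the address.
    return parseaddr(html.unescape(raw_from))


# A bare address in angle brackets has no display name to strip out.
print(split_from("&lt;klatta2@cox.net&gt;"))
# A value with a display name yields both parts.
print(split_from("Jane Doe &lt;jane@example.com&gt;"))
```

The first call returns an empty display name alongside the address, so any check that expects a non-empty name would need to treat this case separately.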

vielmetti commented 8 years ago

This same error is happening on message number 7406 of the group "a2b3". Again, if you need access to the archives, let me know.

AssertionError: Stripped name  didn't match author name mtradem@comcast.net (check name was mtradem@comcast.net) (data from was &lt;mtradem@comcast.net&gt;)

csaftoiu commented 8 years ago

Ok, I think PR #32 fixes it - can you check out that branch and see if it works for you?

vielmetti commented 8 years ago

I made the fix and was able to successfully parse message number 7406 of the group "a2b3" and message 2039 of "vacuum-egroup", both of which had crashed before. Thanks!