lahwaacz / wiki-scripts

Framework for writing bots, maintenance scripts or performing data analysis on wikis powered by MediaWiki
http://lahwaacz.github.io/wiki-scripts/
GNU General Public License v3.0

statistics.py: TypeError: '<' not supported between instances of 'datetime.datetime' and 'str' #56

Closed (kynikos closed 5 years ago)

kynikos commented 5 years ago

Hey there, as of today this happens with wiki-scripts at commit f5bfc5171d1bb84992371120096727c0e64396f0 (statistics.py):

% python statistics.py -s
WARNING  API warning(s) for query {'action': 'query', 'prop': 'revisions', 'rvprop': 'content|timestamp', 'titles': 'ArchWiki:Statistics'}:
* Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce> for notice of API deprecations and breaking changes.
* Because "rvslots" was not specified, a legacy format has been used for the output. This format is deprecated, and in the future the new format will always be used.
INFO     Loading data from /home/dario/.cache/wiki-scripts/wiki.archlinux.org/AllRevisionsProps.db.json.gz ...
Traceback (most recent call last):
  File "statistics.py", line 301, in <module>
    sys.exit(statistics.run())
  File "statistics.py", line 101, in run
    self._compose_page()
  File "statistics.py", line 110, in _compose_page
    self.cliargs.us_minrecedits)
  File "statistics.py", line 221, in __init__
    self.modules = UserStatsModules(self.db_allrevsprops, round_to_midnight=True)
  File "/home/dario/dev/arch/wiki-scripts/ws/statistics/UserStatsModules.py", line 49, in __init__
    revisions = sorted(revisions_generator, key=lambda r: (r["user"], r["timestamp"]))
TypeError: '<' not supported between instances of 'datetime.datetime' and 'str'

I thought maybe my cache database had got corrupted, but I restored a week-old backup and got the same exception.

Today I've also updated 4 python-* packages in the system:

community/python-beautifulsoup4  4.7.1-1      4.8.0-1
extra/python-mako                1.0.13-1     1.0.14-1
community/python-pytoml          0.1.20-1     0.1.21-1
community/python-sqlalchemy      1.3.5-1      1.3.6-1

However, downgrading all of them didn't solve the problem either.

As far as I can tell, wiki-scripts has stayed on the same commit since June, and last week it was working for me. I'm out of simple ideas: I could try even older database backups, or start inspecting the database values as wiki-scripts reads them. But first it's better to know whether it's only a problem on my system or whether you see it too. Cheers.

lahwaacz commented 5 years ago

There is a function which parses all timestamp strings into proper datetime.datetime objects for all responses from the wiki's API. It also treats some special cases like converting infinity to datetime.datetime.max. Haha, apparently it does it even in the user field of these revisions :laughing:
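
For anyone reading along, here is a minimal sketch of how that kind of failure can arise. It is not the actual wiki-scripts code: `parse_all`, `parse_known_keys`, `TIMESTAMP_KEYS`, the case-insensitive match on "infinity" and the sample revisions are all assumptions made up for illustration. It only shows that converting every string value, rather than just the timestamp fields, turns a username into a datetime and breaks the sort from the traceback.

```python
import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # MediaWiki-style ISO 8601 timestamps


def parse_string(value):
    """Convert one string to a datetime if it looks like a timestamp."""
    # Assumption: the special case matches the word case-insensitively.
    if value.lower() == "infinity":
        return datetime.datetime.max
    try:
        return datetime.datetime.strptime(value, TS_FORMAT)
    except ValueError:
        return value  # not a timestamp, leave it alone


def parse_all(obj):
    """Hypothetical over-eager parser: converts EVERY string value."""
    if isinstance(obj, dict):
        return {k: parse_all(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [parse_all(v) for v in obj]
    if isinstance(obj, str):
        return parse_string(obj)
    return obj


# Assumed set of keys that are known to hold timestamps.
TIMESTAMP_KEYS = {"timestamp", "expiry"}


def parse_known_keys(obj):
    """Hypothetical safer parser: converts only values under known keys."""
    if isinstance(obj, dict):
        return {
            k: parse_string(v)
            if k in TIMESTAMP_KEYS and isinstance(v, str)
            else parse_known_keys(v)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [parse_known_keys(v) for v in obj]
    return obj


# Made-up revisions; the second one is by a user literally named Infinity.
revisions = [
    {"user": "Alice", "timestamp": "2019-07-20T10:00:00Z"},
    {"user": "Infinity", "timestamp": "2019-07-24T12:00:00Z"},
]


def sort_key(r):
    # Same key as in the traceback: sort by user, then by timestamp.
    return (r["user"], r["timestamp"])


broken = parse_all(revisions)
try:
    sorted(broken, key=sort_key)
    sort_failed = False
except TypeError:
    # The "user" values are now a mix of str and datetime.datetime.
    sort_failed = True

fixed = sorted(parse_known_keys(revisions), key=sort_key)
```

With the over-eager parser the user field of the second revision becomes `datetime.datetime.max`, so comparing it with the string "Alice" raises the TypeError. Restricting the conversion to known timestamp keys keeps usernames as strings and the sort goes through.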

lahwaacz commented 5 years ago

Should be fixed now, but since you have backups, I did not bother to write a migration for AllRevisionsProps.db.json.gz. Restoring a version from before July 24 should work.

kynikos commented 5 years ago

Lol, confirmed fixed (indeed I don't need a migration). So can we consider User:Infinity the first to successfully hack wiki-scripts? For the moment I've reported a CVE to warn the users of previous versions...

Thank you :P