steemit / hivemind

Developer-friendly microservice powering social networks on the Steem blockchain.
MIT License
73 stars 66 forks source link

initial posts_cache sync fails after first batch #98

Closed roadscape closed 6 years ago

roadscape commented 6 years ago

After writing the initial 1M missing cache records, hive tries to update some recently voted posts and crashes. As a workaround simply restarting the process would fix the condition. But:

  1. Hive should not be dirtying posts during initial sync.
  2. Why aren't votes being upgraded to inserts for missing pids?
2018-03-10 21:23:22: Mar 11 03:23:22 ip-172-31-51-184 docker/11923334e851[4257]: [SYNC] Got block 20472000 @ 2018-03-07T16:57:15 (22/s, 225rps, 24wps) -- eta 00:44
2018-03-10 21:23:22: Mar 11 03:23:22 ip-172-31-51-184 docker/11923334e851[4257]: [SYNC] Got block 20472989 @ 2018-03-07T17:46:51 (22/s, 210rps, 24wps) -- done in 4893:56, avg rate: 69.7/s
2018-03-10 21:23:22: Mar 11 03:23:22 ip-172-31-51-184 docker/11923334e851[4257]: [INIT] *** Initial cache build ***
2018-03-10 21:23:22: Mar 11 03:23:22 ip-172-31-51-184 docker/11923334e851[4257]: [INIT] 37299891 missing post cache entries
2018-03-10 21:23:22: Mar 11 03:23:22 ip-172-31-51-184 docker/11923334e851[4257]: [PREP] posts cache process: 1000000 inserts, 84158 upvotes
2018-03-10 21:23:22: Mar 11 03:23:22 ip-172-31-51-184 docker/11923334e851[4257]:  -- post 1000 of 1084158 (425/s, 822rps, 882wps) -- eta 42:24
2018-03-10 21:23:22: Mar 11 03:23:22 ip-172-31-51-184 docker/11923334e851[4257]:  -- post 2000 of 1084158 (486/s, 1023rps, 928wps) -- eta 37:02
2018-03-10 21:23:22: Mar 11 03:23:22 ip-172-31-51-184 docker/11923334e851[4257]:  -- post 3000 of 1084158 (450/s, 889rps, 913wps) -- eta 39:58
2018-03-10 21:23:22: Mar 11 03:23:22 ip-172-31-51-184 docker/11923334e851[4257]:  -- post 4000 of 1084158 (467/s, 974rps, 900wps) -- eta 38:28
...
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:  -- post 999000 of 1084158 (453/s, 913rps, 899wps) -- eta 03:07
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:  -- post 1000000 of 1084158 (356/s, 612rps, 853wps) -- eta 03:56
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]: Traceback (most recent call last):
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:   File "/usr/local/bin/hive", line 11, in <module>
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:     load_entry_point('hivemind==0.0.1', 'console_scripts', 'hive')()
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:   File "/usr/local/lib/python3.5/dist-packages/hive/cli.py", line 20, in run
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:     Sync.run()
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:   File "/usr/local/lib/python3.5/dist-packages/hive/indexer/sync.py", line 47, in run
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:     cls.initial()
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:   File "/usr/local/lib/python3.5/dist-packages/hive/indexer/sync.py", line 79, in initial
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:     CachedPost.recover_missing_posts()
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:   File "/usr/local/lib/python3.5/dist-packages/hive/indexer/cached_post.py", line 282, in recover_missing_posts
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:   File "/usr/local/lib/python3.5/dist-packages/hive/indexer/cached_post.py", line 163, in flush
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:     cls._update_batch(tuples, trx)
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:   File "/usr/local/lib/python3.5/dist-packages/hive/indexer/cached_post.py", line 315, in _update_batch
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:     buffer.append(cls._sql(pid, post, level=level))
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:   File "/usr/local/lib/python3.5/dist-packages/hive/indexer/cached_post.py", line 385, in _sql
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]:     % (level, pid, cls.last_id(), repr(post)))
2018-03-10 22:04:03: Mar 11 04:04:03 ip-172-31-51-184 docker/11923334e851[4257]: Exception: WARNING: new pid, but level=upvote. #1009853 vs 1009703, {'title': 'Meat Robots: Headlines From Another Dimension, Volume 5', 'created': '2016-09-21T18:59:39',....
roadscape commented 6 years ago

The bug in (1) is that promoted payments signal a payout-level cache update. They should not during initial sync; all the necessary data is pulled in when sweeping missing posts. Regarding (2), it's a sanity check we should leave in place; it helps catch bugs like this which should be caught higher up the chain.