steemit / hivemind

Developer-friendly microservice powering social networks on the Steem blockchain.
MIT License
73 stars 65 forks source link

post cache AssertionError: hpc id X maps to Y #181

Open jrawsthorne opened 5 years ago

jrawsthorne commented 5 years ago

I'm re-indexing and the caching process keeps stopping due to an assertion error seen below. What is the best way to figure out what's wrong. Could it be the node sending bad data?

Traceback (most recent call last):,
  File "/usr/local/bin/hive", line 11, in <module>,
  File "/usr/local/lib/python3.6/dist-packages/hive/cli.py", line 24, in run,
    sys.exit(run()),
    Sync(conf=conf).run(),
  File "/usr/local/lib/python3.6/dist-packages/hive/indexer/sync.py", line 47, in run,
    self.initial(),
  File "/usr/local/lib/python3.6/dist-packages/hive/indexer/sync.py", line 90, in initial,
  File "/usr/local/lib/python3.6/dist-packages/hive/indexer/cached_post.py", line 289, in recover_missing_posts,
    CachedPost.recover_missing_posts(self._steem),
  File "/usr/local/lib/python3.6/dist-packages/hive/indexer/cached_post.py", line 172, in flush,
    while cls.flush(steem, trx=True, full_total=gap)['insert']:,
    cls._update_batch(steem, tuples, trx, full_total=full_total),
  File "/usr/local/lib/python3.6/dist-packages/hive/indexer/cached_post.py", line 322, in _update_batch,
    buffer.extend(cls._sql(pid, post, level=level)),
  File "/usr/local/lib/python3.6/dist-packages/hive/indexer/cached_post.py", line 392, in _sql,
    assert pid == pid2, "hpc id %d maps to %d" % (pid, pid2),
AssertionError: hpc id 1206280 maps to 1206230
roadscape commented 5 years ago

Did you run into any other issues prior to this?

Edit: also, the full logs would be helpful

roadscape commented 5 years ago

This should never happen, and has not happened to my knowledge since https://github.com/steemit/hivemind/issues/78 was resolved (until now).

jrawsthorne commented 5 years ago

I'll switch branch and get the logs for you. I'm using rpc.usesteem.com which I hadn't used before so it could be related. I haven't seen any other issues. I am on the latest commit

jrawsthorne commented 5 years ago

Running new branch https://pastebin.com/ZWXeL2P6 Logs from before error https://pastebin.com/ygrw6kPd

jrawsthorne commented 5 years ago

Looks like I might have been right. Changed the endpoint to api.steemit.com and it's running now. Hope he can fix it because it's gone from eta 1 day to eta 2w

roadscape commented 5 years ago

I'd be curious to know what the difference in the APIs is... ideally we could detect the condition and report on it.

jrawsthorne commented 5 years ago

Is there any way to roll it back to run through that post again

roadscape commented 5 years ago

No way to roll back, best approach for now may be to add better logging in case it happens again.

roadscape commented 5 years ago

Concerning; another report of this problem from @anyx who is connecting directly to a steemd.

fwiw I'm using my own rpc (full node, same as http://anyx.io but direct to steemd) AssertionError: hpc id 201 maps to 101 {'id': 100, 'author': 'steem-id', 'permlink': 're-red-steemit-firstpost-2', 'category': 'meta', 'parent_author': 'red', 'parent_permlink': 'steemit-firstpost-2', 'title': 'Nice', 'body': 'Can I get some :D', 'json_metadata': '{}', 'last_update': '2016-04-13T03:48:36', 'created': '2016-04-13T03:48:36', 'active': '2016-04-13T03:48:36', 'last_payout': '2016-08-24T19:59:42', 'depth': 2, 'children': 0, 'net_rshares': 0, 'abs_rshares': 0, 'vote_rshares': 0, 'children_abs_rshares': 0, 'cashout_time': '1969-12-31T23:59:59', 'max_cashout_time': '1969-12-31T23:59:59', 'total_vote_weight': 0, 'reward_weight': 10000, 'total_payout_value': '0.000 SBD', 'curator_payout_value': '0.000 SBD', 'author_rewards': 0, 'net_votes': 1, 'root_author': 'steemit', 'root_permlink': 'firstpost', 'max_accepted_payout': '1000000.000 SBD', 'percent_steem_dollars': 10000, 'allow_replies': True, 'allow_votes': True, 'allow_curation_rewards': True, 'beneficiaries': [], 'url': '/meta/@steemit/firstpost#@steem-id/re-red-steemit-firstpost-2', 'root_title': 'Welcome to Steem!', 'pending_payout_value': '0.000 SBD', 'total_pending_payout_value': '0.000 STEEM', 'active_votes': [{'voter': 'patrick-g', 'weight': 0, 'rshares': 724886327, 'percent': 10000, 'reputation': '1013326481658', 'time': '2016-07-22T19:57:54'}, {'voter': 'the-ego-is-you', 'weight': 0, 'rshares': 0, 'percent': 10000, 'reputation': '7030513148760', 'time': '2016-12-01T21:06:09'}, {'voter': 'bvidal', 'weight': 0, 'rshares': 0, 'percent': 10000, 'reputation': '293298398603', 'time': '2017-05-28T18:17:36'}, {'voter': 'setio', 'weight': 0, 'rshares': 0, 'percent': 10000, 'reputation': '8320716127', 'time': '2017-08-20T23:27:00'}], 'replies': [], 'author_reputation': '2224650372472', 'promoted': '0.000 STEEM', 'body_length': 0, 'reblogged_by': []}

This is the likely relevant stuff for http://anyx.io :

cmake: cmake -DBOOST_ROOT=/boost/boost_160/ -DCMAKE_BUILD_TYPE=Release -DLOW_MEMORY_NODE=OFF -DCLEAR_VOTES=OFF -DSKIP_BY_TX_ID=OFF -DBUILD_STEEM_TESTNET=OFF -DSTEEM_STATIC_BUILD=ON ../

Plugins:


plugin = webserver p2p json_rpc
plugin = database_api network_broadcast_api condenser_api block_api rc_api
plugin = account_history_rocksdb account_history_api
plugin = account_by_key account_by_key_api
plugin = tags tags_api
plugin = follow follow_api
plugin = market_history market_history_api```
roadscape commented 5 years ago

Interestingly, a bunch size of 50 doesn't run into this issue. Batch sizes of 100 and 1000 do.

After testing against anyx's node, it seems to be related to nginx caching. Using a batch size of 100 for 1000 posts, the first two batches were ok and did not hit cache. The other 8 were duplicates of the 2nd batch and headers indicated cache hit. Anyx reported that this issue did not occur with a direct connection to steemd.

roadscape commented 5 years ago

Anyx uncovered the bug: http://mailman.nginx.org/pipermail/nginx/2013-September/040442.html

Is it correct that when $content_length > client_body_buffer_size, then $request_body == "" ? If so this would be worth documenting at request_body.

I am using:

proxy_cache_methods POST;
proxy_cache_key "$request_method$request_uri$request_body";

Which works for small requests, but for large requests clients got very strange results due to $request_body being empty and hence getting false cache hits for completely different form posts.

Anyx:

The directive in nginx proxy_no_cache $request_body_file; is the workaround... basically, "dont cache if a file was created" which is what happens when the body is too large.

Hive could detect these symptoms earlier in get_content_batch and offer a more helpful error message. Need to evaluate if other methods may be affected.

quochuy commented 5 years ago

The same happens to me when using rpc.usesteem.com. Works fine with api.steemit.com and api.steem.house but both regularly trigger some ReadTimeoutError