near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0
2.31k stars 615 forks source link

feat: handle `RpcError` properly #11822

Closed frolvanya closed 1 month ago

frolvanya commented 1 month ago

This PR fixes a small issue that was introduced in #11806

Internal error - 500 status code Handler error - 408 if it's a TIMEOUT_ERROR, otherwise 200 Request validation error - 400 status code

closes #11792

frolvanya commented 1 month ago

@race-of-sloths please, include my PR in the Race

race-of-sloths commented 1 month ago

@frolvanya Thank you for your contribution! Your pull request is now a part of the Race of Sloths! Are you going to win race-of-sloths this month? If so, you should speed up! Run, Sloth, run!

[

<source media="(prefers-color-scheme: light)" srcset="https://badge.race-of-sloths.com/frolvanya?type=bot&theme=light">
<img alt="Shows profile picture for the author of the PR" src="https://badge.race-of-sloths.com/frolvanya?type=bot&theme=white">

](https://race-of-sloths.com/profile/frolvanya)

Current status: executed | Reviewer | Score | |--------|--------| | @frol | 2 | **The average score is 2** @frolvanya check out your results on the [Race of Sloths Leaderboard!](https://race-of-sloths.com/leaderboard) and in the [profile](https://race-of-sloths.com/profile/frolvanya)
What is the Race of Sloths

Race of Sloths is a friendly competition where you can participate in challenges and compete with other open-source contributors within your normal workflow For contributors: - Tag @race-of-sloths inside your pull requests - Wait for the maintainer to review and score your pull request - Check out your position in the [Leaderboard](https://race-of-sloths.com/leaderboard) - Keep weekly and monthly streaks to reach higher positions - Boast your contributions with a dynamic picture of your [Profile](https://race-of-sloths.com/profile/frolvanya) For maintainers: - Score pull requests that participate in the Race of Sloths - Engage contributors with fair scoring and fast responses so they keep their streaks - Promote the Race to the point where the Race starts promoting you - Grow the community of your contributors Feel free to check [our website](https://race-of-sloths.com) for additional details!

Bot commands - For contributors - **Include a PR:** `@race-of-sloths include` to enter the Race with your PR - For maintainers: - **Assign points:** `@race-of-sloths score [1/2/3/5/8/13]` to award points based on your assessment. - **Reject this PR:** `@race-of-sloths exclude` to send this PR back to the drawing board. - **Exclude repo:** `@race-of-sloths pause` to stop bot activity in this repo until `@race-of-sloths unpause` command is called
race-of-sloths commented 1 month ago

🌟 Score recorded!

@frol, thank you for scoring this pull request in the Race of Sloths!

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 75.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 71.81%. Comparing base (db0644f) to head (62268a4). Report is 16 commits behind head on master.

Files Patch % Lines
chain/jsonrpc/src/lib.rs 75.00% 2 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #11822 +/- ## ========================================== + Coverage 71.76% 71.81% +0.05% ========================================== Files 796 796 Lines 163183 163221 +38 Branches 163183 163221 +38 ========================================== + Hits 117110 117220 +110 + Misses 41025 40955 -70 + Partials 5048 5046 -2 ``` | [Flag](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | Coverage Δ | | |---|---|---| | [backward-compatibility](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `0.23% <0.00%> (-0.01%)` | :arrow_down: | | [db-migration](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `0.23% <0.00%> (-0.01%)` | :arrow_down: | | [genesis-check](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `1.35% <0.00%> (-0.01%)` | :arrow_down: | | [integration-tests](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `37.82% <50.00%> (+0.01%)` | :arrow_up: | | [linux](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `71.57% <75.00%> (+0.19%)` | :arrow_up: | | [linux-nightly](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `71.39% <75.00%> (+0.03%)` | :arrow_up: | | [macos](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `54.14% <75.00%> (-0.46%)` | :arrow_down: | | [pytests](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `1.62% <0.00%> (+<0.01%)` | :arrow_up: | | [sanity-checks](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `1.42% <0.00%> (+<0.01%)` | :arrow_up: | | [unittests](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `66.28% <75.00%> (+0.06%)` | :arrow_up: | | [upgradability](https://app.codecov.io/gh/near/nearcore/pull/11822/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near) | `0.28% <0.00%> (-0.01%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=near#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

tayfunelmas commented 1 month ago

You may need to go over Nayduck tests to make sure the new RPC errors are handled correctly (eg. timeout error is checked correctly). Thanks.

frolvanya commented 1 month ago

You may need to go over Nayduck tests to make sure the new RPC errors are handled correctly (eg. timeout error is checked correctly). Thanks.

Okay, I'll add some tests today

frolvanya commented 1 month ago

@tayfunelmas Is it ok to extend nearcore/chain/jsonrpc/jsonrpc-tests/tests/rpc_query.rs instead of Nayduck?

P.S: Already did it and now code coverage is 75%. During this week I'm planning to split jsonrpc into a trait and a structure. After this I should be able to use https://crates.io/crates/mockall to create a mock of JsonRpcHandler::process function that would return an internal error for me. At least for now, that's the only way how I see adding a test case for the internal error

tayfunelmas commented 1 month ago

Okay, I'll add some tests today

Note that, I did not mean to add new tests (can be done later) but to check if the existing Nayduck tests still pass after this change, because the current tests assume even the errors will be responded with 200 and they check the fields of the response to see if it is an error or not.

tayfunelmas commented 1 month ago

@tayfunelmas Is it ok to extend nearcore/chain/jsonrpc/jsonrpc-tests/tests/rpc_query.rs instead of Nayduck?

P.S: Already did it and now code coverage is 75%. During this week I'm planning to split jsonrpc into a trait and a structure. After this I should be able to use https://crates.io/crates/mockall to create a mock of JsonRpcHandler::process function that would return an internal error for me. At least for now, that's the only way how I see adding a test case for the internal error

Yes, it should work. Thank you for adding the test.

aborg-dev commented 1 month ago

FYI, I believe our Python load testing infrastructure is also affected with the new error codes. If I understand correctly, after https://github.com/near/nearcore/pull/11806 we started returning code 500 instead of 200 for queries about account existence, e.g:

{
  "jsonrpc": "2.0",
  "id": "dontcare",
  "method": "query",
  "params": {
    "request_type": "view_account",
    "finality": "final",
    "account_id": "ft_0.node0"
  }
}

Our Python code for checking accounts was relying on this RPC returning 200 and now started throwing exceptions here:

[2024-07-23 13:27:46,687] spire/ERROR/root: Uncaught exception in event handler:
Traceback (most recent call last):
  File "/home/akashin/.pyenv/versions/3.11.9/envs/near-3.11/lib/python3.11/site-packages/locust/event.py", line 47, in fire
    handler(**kwargs)
  File "/home/akashin/repos/work/nearcore/pytest/tests/loadtest/locust/common/ft.py", line 235, in on_locust_init
    ft_contract.install(node, funding_account)
  File "/home/akashin/repos/work/nearcore/pytest/tests/loadtest/locust/common/ft.py", line 33, in install
    existed = node.prepare_account(self.account, parent,
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akashin/repos/work/nearcore/pytest/tests/loadtest/locust/common/base.py", line 452, in prepare_account
    exists = self.account_exists(account.key.account_id)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akashin/repos/work/nearcore/pytest/tests/loadtest/locust/common/base.py", line 445, in account_exists
    return "error" not in self.node.get_account(account_id, do_assert=False)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akashin/repos/work/nearcore/pytest/lib/cluster.py", line 346, in get_account
    res = self.json_rpc('query', query, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/akashin/repos/work/nearcore/pytest/lib/cluster.py", line 263, in json_rpc
    r.raise_for_status()
  File "/home/akashin/.pyenv/versions/3.11.9/envs/near-3.11/lib/python3.11/site-packages/geventhttpclient/requests.py", line 43, in raise_for_status
    raise useragent.BadStatusCode(self.url, code=self.status_code)
geventhttpclient.useragent.BadStatusCode: URL http://127.0.0.1:3030: code=500

It's possible to fix the error handling in Python, but I suspect it will be a non-trivial amount of work as we will need to review each user of json_rpc method. Given the other comment from @tayfunelmas , this likely affects more clients, hence I'm wondering if changing the error code here brings more benefits than harm?

frolvanya commented 1 month ago

Since now throwing bad status codes can be considered as a normal operation, I think that we can remove .raise_for_status() completely or write a new utility that would allow only 200, 400, 408 and 500 status codes. However, this fix feels odd. @akashin What is your opinion on that?

hence I'm wondering if changing the error code here brings more benefits than harm?

I believe that if we will find a way to fix all tests with little to no effort, then it's worth it

frolvanya commented 1 month ago

FYI, I believe our Python load testing infrastructure is also affected with the new error codes. If I understand correctly, after #11806 we started returning code 500 instead of 200 for queries about account existence, e.g:

And this was a bad way to do it. In this PR we return 200 for errors like "unknown account" The only handler error that return non-200 status code is a TIMEOUT_ERROR

aborg-dev commented 1 month ago

FYI, I believe our Python load testing infrastructure is also affected with the new error codes. If I understand correctly, after #11806 we started returning code 500 instead of 200 for queries about account existence, e.g:

And this was a bad way to do it. In this PR we return 200 for errors like "unknown account" The only handler error that return non-200 status code is a TIMEOUT_ERROR

Sounds good, in this case I'll wait for this PR to land and then work to fix any other problems that pop up in our Python libraries.

frolvanya commented 1 month ago

FYI, I believe our Python load testing infrastructure is also affected with the new error codes. If I understand correctly, after #11806 we started returning code 500 instead of 200 for queries about account existence, e.g:

And this was a bad way to do it. In this PR we return 200 for errors like "unknown account" The only handler error that return non-200 status code is a TIMEOUT_ERROR

Sounds good, in this case I'll wait for this PR to land and then work to fix any other problems that pop up in our Python libraries.

Great!

@tayfunelmas is PR ready to be merged or do you want me to add/fix something?

race-of-sloths commented 1 month ago

✅ PR is finalized!

Your contribution is much appreciated with a final score of 2! You have received 20 Sloth points for this contribution