webtorrent / webtorrent-hybrid

WebTorrent (with WebRTC support in Node.js)
https://webtorrent.io
MIT License
519 stars 98 forks source link

Very high CPU usage even with no peers #120

Closed lucybythec closed 2 years ago

lucybythec commented 3 years ago

Hi there,

We are evaluating using WebTorrent for asset distribution, and ran into the following obstacle. Consider the following example application:

const fs = require('fs')
const WebTorrent = require('webtorrent-hybrid')
const client = new WebTorrent()

fs.readdir('files', function(err, list) {
  if (err)
    console.log(err)
  list.forEach(function(filename) {
    const id = decodeURIComponent(filename)
    const path = 'files/' + filename
    client.seed(path, function (torrent) {
      console.log(torrent.magnetURI)
    })
  })
})

files is a directory containing about 50 files each under 50 MB.

The issue we are seeing is that even when running the above code and doing nothing else, including any attempt to access these files from another client, the node process is using very high CPU usage and does not timely respond to other events (e.g. requests to a HTTP server running in the same process will time out).

I have seen similar reported problems regarding resource usage (https://github.com/webtorrent/webtorrent/issues/1026 and linked issues), however, this problem looks different in that there is no activity other than preparing the files for seeding, with no attempt to actually access them.

Using node --inspect and collecting a CPU profile showed that the high CPU usage seems to be the fault of the wrtc module. Here you can see the RTCPeerConnection constructor taking over 3 minutes to run:

image

Examining the node process with gdb revealed the following:

  1. The program creates a very large number of threads (over 1500).

  2. Most of these threads seem to have one of the following two stack traces:

    1. #0  0x00007f3db347aa47 in epoll_wait (epfd=46189, events=0x7f26ca893710, maxevents=32, timeout=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
      #1  0x00007f3d9b53176d in epoll_dispatch () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
      #2  0x00007f3d9b52e857 in event_base_loop () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
      #3  0x00007f3d9b52cc4b in webrtc::(anonymous namespace)::TaskQueueLibevent::ThreadMain(void*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
      #4  0x00007f3d9b3c5ccd in rtc::PlatformThread::StartThread(void*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
      #5  0x00007f3db37516db in start_thread (arg=0x7f3d3598a700) at pthread_create.c:463
      #6  0x00007f3db347a71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
    2. #0  0x00007f3db3757fb9 in futex_reltimed_wait_cancelable (private=<optimized out>, reltime=0x7f3d35888c90, expected=0, futex_word=0x7f26ca88fd38) at ../sysdeps/unix/sysv/linux/futex-internal.h:142
      #1  __pthread_cond_wait_common (abstime=0x7f3d35888d58, mutex=0x7f26ca88fce8, cond=0x7f26ca88fd10) at pthread_cond_wait.c:533
      #2  __pthread_cond_timedwait (cond=0x7f26ca88fd10, mutex=0x7f26ca88fce8, abstime=0x7f3d35888d58) at pthread_cond_wait.c:667
      #3  0x00007f3d9b3c6333 in rtc::Event::Wait(int, int) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
      #4  0x00007f3d9b71fde5 in webrtc::ProcessThreadImpl::Process() () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
      #5  0x00007f3d9b71f588 in webrtc::ProcessThreadImpl::Run(void*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
      #6  0x00007f3d9b3c5ccd in rtc::PlatformThread::StartThread(void*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
      #7  0x00007f3db37516db in start_thread (arg=0x7f3d35889700) at pthread_create.c:463
      #8  0x00007f3db347a71f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
  3. The main node thread seems to spend most of its time in rtc::Thread::ClearInternal:

    #0  0x00007f3d9b3b7165 in rtc::Thread::ClearInternal(rtc::MessageHandler*, unsigned int, std::__1::list<rtc::Message, std::__1::allocator<rtc::Message> >*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #1  0x00007f3d9b3b90f3 in rtc::Thread::Clear(rtc::MessageHandler*, unsigned int, std::__1::list<rtc::Message, std::__1::allocator<rtc::Message> >*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #2  0x00007f3d9b3b62b3 in rtc::ThreadManager::Clear(rtc::MessageHandler*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #3  0x00007f3d9b60c0a4 in webrtc::DataChannelProxyWithInternal<webrtc::DataChannelInterface>::RegisterObserver(webrtc::DataChannelObserver*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #4  0x00007f3d9b220863 in node_webrtc::RTCDataChannel::RTCDataChannel(Napi::CallbackInfo const&) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #5  0x00007f3d9b22530b in Napi::ObjectWrap<node_webrtc::RTCDataChannel>::ConstructorCallbackWrapper(napi_env__*, napi_callback_info__*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #6  0x00000000009b8c4f in v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo<v8::Value> const&) ()
    #7  0x0000000000be3f25 in v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<true>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) ()
    #8  0x0000000000be4e26 in v8::internal::Builtins::InvokeApiFunction(v8::internal::Isolate*, bool, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*, v8::internal::Handle<v8::internal::HeapObject>) ()
    #9  0x0000000000cc0f14 in v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) ()
    #10 0x0000000000cc15fe in v8::internal::Execution::New(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) ()
    #11 0x0000000000b88e58 in v8::Function::NewInstanceWithSideEffectType(v8::Local<v8::Context>, int, v8::Local<v8::Value>*, v8::SideEffectType) const ()
    #12 0x00000000009c061a in napi_new_instance ()
    #13 0x00007f3d9b21ca72 in node_webrtc::RTCDataChannel::Create(node_webrtc::DataChannelObserver*, rtc::scoped_refptr<webrtc::DataChannelInterface>) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #14 0x00007f3d9b24c43a in node_webrtc::RTCDataChannel* std::__1::__function::__policy_invoker<node_webrtc::RTCDataChannel* ()>::__call_impl<std::__1::__function::__default_alloc_func<node_webrtc::Wrap<node_webrtc::RTCDataChannel*, rtc::scoped_refptr<webrtc::DataChannelInterface>, node_webrtc::DataChannelObserver*>::GetOrCreate(node_webrtc::DataChannelObserver*, rtc::scoped_refptr<webrtc::DataChannelInterface>)::{lambda()#1}, node_webrtc::RTCDataChannel* ()> >(std::__1::__function::__policy_storage const*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #15 0x00007f3d9b257910 in node_webrtc::RTCDataChannel* std::__1::__function::__policy_invoker<node_webrtc::RTCDataChannel* ()>::__call_impl<std::__1::__function::__default_alloc_func<node_webrtc::BidiMap<rtc::scoped_refptr<webrtc::DataChannelInterface>, node_webrtc::RTCDataChannel*>::computeIfAbsent(rtc::scoped_refptr<webrtc::DataChannelInterface>, std::__1::function<node_webrtc::RTCDataChannel* ()>)::{lambda()#1}, node_webrtc::RTCDataChannel* ()> >(std::__1::__function::__policy_storage const*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #16 0x00007f3d9b24fe33 in node_webrtc::Wrap<node_webrtc::RTCDataChannel*, rtc::scoped_refptr<webrtc::DataChannelInterface>, node_webrtc::DataChannelObserver*>::GetOrCreate(node_webrtc::DataChannelObserver*, rtc::scoped_refptr<webrtc::DataChannelInterface>) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #17 0x00007f3d9b24932c in node_webrtc::RTCPeerConnection::CreateDataChannel(Napi::CallbackInfo const&) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #18 0x00007f3d9b24edcd in Napi::ObjectWrap<node_webrtc::RTCPeerConnection>::InstanceMethodCallbackWrapper(napi_env__*, napi_callback_info__*) () from /tmp/testapp/node_modules/wrtc/build/Release/wrtc.node
    #19 0x00000000009b8c4f in v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo<v8::Value> const&) ()
    #20 0x0000000001398bad in Builtins_CallApiCallback ()
    #21 0x00007ffcd400edb8 in ?? ()
    #22 0x00007ffcd400edf0 in ?? ()
    #23 0x0000000000000002 in ?? ()
    #24 0x0000000000000048 in ?? ()
    #25 0x0000002100000000 in ?? ()
    #26 0x00007ffcd400ed70 in ?? ()
    #27 0x0000000000000006 in ?? ()
    #28 0x00007ffcd400ee60 in ?? ()
    #29 0x000009d8f3ceb153 in ?? ()
    #30 0x00002fa7609a2d91 in ?? ()
    #31 0x0000000005c738b0 in ?? ()
    #32 0x00002e7ca1b00471 in ?? ()
    #33 0x00002e7ca1b00471 in ?? ()
    #34 0x000003399fc43201 in ?? ()
    #35 0x00002e7ca1b00471 in ?? ()
    #36 0x00003dcc9c583219 in ?? ()
    #37 0x00002fa7609a2af1 in ?? ()
    #38 0x00002fa7609a2d91 in ?? ()
    #39 0x0000000000000014 in ?? ()
    #40 0x0000000005d23f80 in ?? ()
    #41 0x00003dcc9c583219 in ?? ()
    #42 0x00002fa7609a2af1 in ?? ()
    #43 0x00002fa7609a2a99 in ?? ()
    #44 0x00002fa7609a2cc1 in ?? ()
    #45 0x00002fa7609a2241 in ?? ()
    #46 0x00002fa7609a2241 in ?? ()
    #47 0x00002fa7609a2191 in ?? ()
    #48 0x00002fa7609a21b9 in ?? ()
    #49 0x000027f98a1e5469 in ?? ()
    #50 0x000033791316ef59 in ?? ()
    #51 0x00007ffcd400eeb0 in ?? ()
    #52 0x0000000001393dea in Builtins_JSConstructStubGeneric ()
    Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Looking at the implementation of rtc::Thread::ClearInternal, it does look like it contains some loops which might explain the high CPU usage. However, I can't say if the overall problem is due to a problem in the Google WebRTC library or the wrtc Node module or WebTorrent or somewhere else entirely.

What version of this package are you using?

4.0.3

What operating system, Node.js, and npm version?

Ubuntu 18.04.5 LTS, Node v14.13.1, yarn 1.22.10

What happened?

Unexpected high CPU usage and unresponsive application when seeding files with no other activity.

What did you expect to happen?

CPU usage should be reasonable and the application should remain responsive.

Are you willing to submit a pull request to fix this bug?

Possibly (with assistance), but the problem is not yet fully understood.

lucybythec commented 3 years ago

An update: as per #118 we tried to replace wrtc with node-datachannel. The above program changes as follows:

const nodeDataChannel = require('node-datachannel')
const client = new WebTorrent({
  tracker : {
    wrtc : nodeDataChannel
  }
})

As a result, the CPU usage is solved, so we can conclude that the problem does not lie with webtorrent-hybrid. However, we have been unable to confirm if the WebRTC functionality is actually working. In fact, we have been unable to confirm that it is working even without this change (using wrtc). I will file that as a separate issue.

lucybythec commented 3 years ago

Another update related to the above: the above change doesn't actually work as the two libraries (as expected) have very different APIs. It so happened that all errors caused by an incompatible API, including calls to undefined methods, were silently ignored. To see them, you need to raise the log level. For now we have given up trying to use webtorrent with Node and are evaluating libtorrent (with webtorrent support) instead.

github-actions[bot] commented 2 years ago

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?