Closed by crozone 4 years ago
cc: @stephentoub
@crozone, thanks for the bug report, including repro code. I just tried this out with .NET Core on both Windows and Ubuntu (I don't have a Scientific Linux setup handy). I see 100% CPU usage on both of those, but not in the call to ReadAsync. Rather, the calls to ReadAsync are completing immediately and returning 0, indicating that the response stream has ended and there's nothing more to be read. The spinning is coming from the while (true) loop in your repro, which just repeatedly calls ReadAsync even if it has already returned 0:
while (true) {
    // *** ISSUE ***
    // TODO: This is where we get 100% CPU on RHEL
    int byteCount = await eventStream.ReadAsync(buffer, 0, buffer.Length);
    // *** END ISSUE ***
    char[] chars = new char[decoder.GetCharCount(buffer, 0, byteCount)];
    decoder.GetChars(buffer, 0, byteCount, chars, 0);
    Console.Write(new string(chars));
}
I tried it using http://www.w3schools.com/HTML/demo_sse.php. Maybe there's something special about this endpoint vs yours that's causing a difference?
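For reference, a minimal sketch of a read loop that treats a 0-byte read as end of stream rather than calling ReadAsync again. The class and method names (EventStreamReader, PumpAsync) and the buffer size are assumptions for illustration; they are not taken from the attached repro project.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original repro.
public static class EventStreamReader
{
    // Copies decoded UTF-8 text from eventStream to output until the stream ends.
    // Returns the total number of bytes read.
    public static async Task<long> PumpAsync(Stream eventStream, TextWriter output)
    {
        byte[] buffer = new byte[2048];
        // One decoder for the whole stream, so a multi-byte character split
        // across two reads is still decoded correctly.
        Decoder decoder = Encoding.UTF8.GetDecoder();
        long total = 0;

        while (true)
        {
            int byteCount = await eventStream.ReadAsync(buffer, 0, buffer.Length);
            if (byteCount == 0)
            {
                // 0 bytes means the stream has ended: leave the loop instead of spinning.
                break;
            }

            total += byteCount;
            char[] chars = new char[decoder.GetCharCount(buffer, 0, byteCount)];
            int charCount = decoder.GetChars(buffer, 0, byteCount, chars, 0);
            output.Write(chars, 0, charCount);
        }

        return total;
    }
}
```

With this shape, a server that closes the connection immediately makes the method return straight away, so any CPU spin that remains is not caused by the caller's loop.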
After further investigation, I'm now confident that I can reproduce this issue on RHEL SL, and that the issue doesn't occur on Ubuntu 14.04 and Windows 10. Against my event source (which is a non-public event source hosted at https://api.particle.io/v1/devices/events/, which I believe runs on top of node.js), the code correctly awaits eventStream.ReadAsync and only continues when there is at least one byte of data to be read. It only appears to return 0 bytes after the connection has been closed. Again, on Ubuntu and Windows this uses virtually no CPU; on RHEL SL it pegs the core at 100%.
I believe the reason you were seeing the 100% CPU from the while(true) loop in your repro attempt was that the w3schools event source demo doesn't behave like a typical event source would: it closes itself almost immediately, causing bytesRead to be 0 and producing an infinite reconnection loop (poor repro code on my part). I've updated the repro code to be more verbose in its debug output, and added an await Task.Delay(1000) on reconnect to prevent any tight loops.
To investigate further, I ran strace -f against the process on both Ubuntu and RHEL SL to see what calls were being made by both processes, and got significantly different results. Ubuntu appears to perform relatively few futex calls with a few wait4 calls thrown in between them. SL performs similar futex calls, but with many poll and clock_gettime(CLOCK_MONOTONIC... syscalls in between them. Alarmingly, the strace output for the program running for 15 seconds on Ubuntu was 200 KB (about 3000 syscalls); on RHEL SL it was 11.5 MB (about 115000 syscalls). It definitely looks like RHEL SL is spinning.
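To compare two strace -f captures by syscall rather than by file size, the entries can be tallied per syscall name. The sketch below is an illustration only: the StraceTally name is hypothetical, and the regular expression assumes the "[pid N] name(" line shape seen in strace -f output, counting each call once at its start and ignoring "<... name resumed>" continuation lines.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for comparing strace captures.
public static class StraceTally
{
    // Matches the start of a syscall, with an optional "[pid 1234]" prefix.
    // Lines such as "<... futex resumed> ) = 0" do not match and are skipped.
    private static readonly Regex CallStart =
        new Regex(@"^(?:\[pid\s+\d+\]\s+)?([a-z_][a-z0-9_]*)\(", RegexOptions.Compiled);

    public static Dictionary<string, int> Count(IEnumerable<string> lines)
    {
        var counts = new Dictionary<string, int>();
        foreach (string line in lines)
        {
            Match match = CallStart.Match(line);
            if (!match.Success)
            {
                continue;
            }

            string name = match.Groups[1].Value;
            counts.TryGetValue(name, out int current); // current is 0 when the key is absent
            counts[name] = current + 1;
        }

        return counts;
    }
}
```

Passing System.IO.File.ReadLines(path) for each capture and printing the dictionaries side by side should make a poll or clock_gettime flood on one platform stand out. Running strace with -c produces a similar summary directly.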
I have attached the new repro code, as well as the program's STDOUT output and the strace outputs from the program running on both Ubuntu and RHEL SL (only the first few thousand lines from RHEL SL, for size reasons).
New Repro Project: RHELBugTest2.zip
Program output: software-output.txt
Complete Ubuntu strace -f: ubuntu-14.04-AsyncRead-strace.txt
Partial RHEL SL strace -f: redhat-7.2-AsyncRead-strace-partial.txt
EDIT: Changed RHEL to SL, since technically this is a Scientific Linux 7.2 build, not actual RHEL 7.2 (although they're built from practically the same source).
Dropping by as the sysadmin that runs said EL servers if any input is required as to any system config or other details that may be required. Don't be afraid to ask for info :)
@crozone and @CRCinAU, thanks for following up.
@crozone, would you be able to include an EventListener like https://github.com/dotnet/corefx/blob/master/src/Common/tests/System/Diagnostics/Tracing/ConsoleEventListener.cs in your repro, wrapping your main method in something like:
using (new ConsoleEventListener("Http"))
{
    ...
}
I'm curious to see what the log shows while this spinning is happening. This will turn on both libcurl's logging as well as additional logging we do in the System.Net.Http component, and will route it all to stdout (if you'd prefer to route it to a file, you could of course edit the ConsoleEventListener to write the data wherever you like).
Thanks!
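For readers without the linked file to hand, here is a rough sketch of what such a listener can look like. It is a simplified stand-in, not the corefx ConsoleEventListener: the FilteredEventListener and DemoEventSource names, the "Demo-Http" source name, and the output format are assumptions. It relies only on the public System.Diagnostics.Tracing API (EventListener, EnableEvents, OnEventWritten).

```csharp
using System;
using System.Diagnostics.Tracing;
using System.IO;

// Hypothetical stand-in for the linked ConsoleEventListener.
public sealed class FilteredEventListener : EventListener
{
    private readonly string _filter;
    private readonly TextWriter _output;

    public FilteredEventListener(string filter, TextWriter output)
    {
        _filter = filter ?? string.Empty;
        _output = output ?? Console.Out;
    }

    protected override void OnEventSourceCreated(EventSource eventSource)
    {
        // Enable every source here and filter on write instead: this method can be
        // called from the base constructor, before _filter has been assigned.
        EnableEvents(eventSource, EventLevel.LogAlways, EventKeywords.All);
    }

    protected override void OnEventWritten(EventWrittenEventArgs eventData)
    {
        if (_filter == null || _output == null)
        {
            return; // listener is still being constructed
        }

        string name = eventData.EventSource.Name;
        if (name.IndexOf(_filter, StringComparison.OrdinalIgnoreCase) < 0)
        {
            return;
        }

        string payload = eventData.Payload == null ? string.Empty : string.Join(", ", eventData.Payload);
        lock (_output)
        {
            _output.WriteLine($"[{name}-{eventData.EventId}] ({payload}).");
        }
    }
}

// Hypothetical event source, included only so the listener can be tried in isolation.
[EventSource(Name = "Demo-Http")]
public sealed class DemoEventSource : EventSource
{
    public static readonly DemoEventSource Log = new DemoEventSource();

    [Event(1)]
    public void Message(string text) => WriteEvent(1, text);
}
```

Wrapping the read loop in using (new FilteredEventListener("Http", Console.Out)) { ... } would then print events from any source whose name contains "Http".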
Okay, I placed the using (new ConsoleEventListener("Http")){} around the event source read loop, and ran it on both the Ubuntu and Scientific Linux machines.
Here are the outputs (with private Bearer tokens redacted 8-) )
Ubuntu: ubuntu-debug.txt
Scientific Linux: sl-debug.txt
They both drop into very similar loops which repeat every 15 seconds as LineFeed characters are sent down from the server (as heartbeats).
The main difference I noticed is that the Ubuntu machine spits out
[Microsoft-System-Net-Http-6] (3, 0, WaitForWork, Wait wake-up).
then
[Microsoft-System-Net-Http-6] (3, 1, HandleIncomingRequests, Type: Unpause).
and then continues to produce
[Microsoft-System-Net-Http-6] (3, 0, WaitForWork, Wait timeout).
every few seconds until the next NewLine character is read.
On the Scientific Linux system, it never outputs a "WaitForWork"; it just hits
[Microsoft-System-Net-Http-6] (3, 1, HandleIncomingRequests, Type: Unpause).
and then sits there until the NewLine is sent.
Is it possible that this behaviour is being caused by a difference in libcurl? libcurl appears to be "libcurl/7.29.0" on the Scientific Linux installation, and "libcurl/7.35.0" on the Ubuntu installation.
Hi, our customer hit a similar issue on CentOS; our coreclr console app is making a long-poll REST call: https://github.com/Microsoft/vsts-agent/issues/454
@TingluoHuang thanks for the extra data! Do you know which libcurl version is involved in your customer's case?
I'm so far unable to reproduce this myself, but this is likely because I don't have the same event source.
@crozone, thank you for sending the logs; I'm still looking through them. It may also be useful to collect a perf trace: https://github.com/dotnet/coreclr/blob/master/Documentation/project-docs/linux-performance-tracing.md.
Also, have you seen this behavior on the RTM .NET Core release?
Loop in my customer. :) @ppanyukov, can you provide the information @ericeil wanted?
Thanks, Ting
@ericeil the version of libcurl is this:
libcurl-7.29.0-25.el7.centos.x86_64
I will get a newer version and see if the problem still exists.
@ericeil This looks to be definitely related to libcurl.
I have updated to libcurl-7.50.0-2.0.cf.rhel7.x86_64 and the problem has gone away.
The syscall stats look way more healthy now:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 70.00    0.028000          17      1672       255 futex
 20.00    0.008000         242        33           poll
 10.00    0.004000         571         7         4 restart_syscall
  0.00    0.000000           0       352           mprotect
  0.00    0.000000           0         1           madvise
  0.00    0.000000           0       178           wait4
  0.00    0.000000           0        36           gettimeofday
  0.00    0.000000           0        36           getrusage
  0.00    0.000000           0       177           gettid
  0.00    0.000000           0       748           clock_gettime
------ ----------- ----------- --------- --------- ----------------
100.00    0.040000                  3240       259 total
0.00user 0.02system 0:17.90elapsed 0%CPU (0avgtext+0avgdata 2444maxresident)k
0inputs+0outputs (0major+159minor)pagefaults 0swaps
Here are the actual sequences:
[pid 21068] wait4(267, <unfinished ...>
[pid 4643] <... futex resumed> ) = 0
[pid 21068] <... wait4 resumed> 0x7ffa937fd5e4, WNOHANG, NULL) = 0
[pid 4643] clock_gettime(CLOCK_REALTIME, <unfinished ...>
[pid 21068] mprotect(0x7ffb4d794000, 4096, PROT_READ|PROT_WRITE <unfinished ...>
[pid 4643] <... clock_gettime resumed> {1469706279, 262498381}) = 0
[pid 21068] <... mprotect resumed> ) = 0
[pid 21068] mprotect(0x7ffb4d794000, 4096, PROT_NONE <unfinished ...>
[pid 4643] clock_gettime(CLOCK_REALTIME, <unfinished ...>
[pid 21068] <... mprotect resumed> ) = 0
[pid 4643] <... clock_gettime resumed> {1469706279, 262641983}) = 0
[pid 21068] gettid( <unfinished ...>
[pid 4643] futex(0x1285614, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 7989, {1469706279, 362641983}, ffffffff <unfinished ...>
[pid 21068] <... gettid resumed> ) = 915
[pid 21068] clock_gettime(CLOCK_MONOTONIC, {83226, 548594884}) = 0
[pid 21068] clock_gettime(CLOCK_REALTIME, {1469706279, 262839286}) = 0
[pid 21068] futex(0x7ffa84003044, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 149, {1469706299, 262839286}, ffffffff <unfinished ...>
[pid 4643] <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 4643] futex(0x12855e8, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4643] futex(0x7ffaa0091664, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7ffaa0091660, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
[pid 4682] <... futex resumed> ) = 0
[pid 4643] <... futex resumed> ) = 1
[pid 4682] futex(0x7ffaa0091638, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 4643] futex(0x7ffaa0091638, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4682] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 4643] <... futex resumed> ) = 0
[pid 4682] futex(0x7ffaa0091638, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 4643] futex(0x1285614, FUTEX_WAIT_PRIVATE, 7991, NULL <unfinished ...>
[pid 4682] futex(0x1285614, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x1285610, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
[pid 4643] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 4682] <... futex resumed> ) = 0
[pid 4643] futex(0x12855e8, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 4682] futex(0x12855e8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4643] <... futex resumed> ) = -1 EAGAIN (Resource temporarily unavailable)
[pid 4682] <... futex resumed> ) = 0
[pid 4643] futex(0x12855e8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 4682] wait4(267, <unfinished ...>
For those who want to play around with this version of libcurl on EL7, here is what I did to get it:
# EPEL repo is for libnghttp2.so.14 required by new libcurl
echo "Installing curl" \
&& yum -y install epel-release \
&& rpm -Uvh http://www.city-fan.org/ftp/contrib/yum-repo/rhel7/x86_64/city-fan.org-release-1-13.rhel7.noarch.rpm \
&& yum -y install libcurl
# rpm -qa | grep libcurl
libcurl-7.50.0-2.0.cf.rhel7.x86_64
Of course, this repository and this build of libcurl will never be approved for production in most shops, I think, so let's not have this as a solution please :)
Oh, and this may be something in the underlying libraries, not libcurl itself.
For completeness, here is the list of dependencies.
The libcurl-7.50.0-2.0.cf.rhel7.x86_64 package pulls in these:
=============================================================================================
 Package            Arch        Version                  Repository         Size
=============================================================================================
Updating:
 libcurl            x86_64      7.50.0-2.0.cf.rhel7      city-fan.org      377 k
Installing for dependencies:
 libicu             x86_64      50.1.2-15.el7            base              6.9 M
 libmetalink        x86_64      0.1.2-9.rhel7            city-fan.org       25 k
 libnghttp2         x86_64      1.7.1-1.el7              epel               61 k
 libpsl             x86_64      0.7.0-1.el7              city-fan.org       45 k
Updating for dependencies:
 curl               x86_64      7.50.0-2.0.cf.rhel7      city-fan.org      414 k
 libssh2            x86_64      1.7.0-5.0.cf.rhel7       city-fan.org      102 k
Transaction Summary
=============================================================================================
Install             ( 4 Dependent packages)
Upgrade  1 Package  (+2 Dependent packages)
Library dependencies. Standard CentOS 7.2: libcurl-7.29.0-25.el7.centos.x86_64:
# ldd /lib64/libcurl.so.4
linux-vdso.so.1 => (0x00007ffedfeef000)
libidn.so.11 => /lib64/libidn.so.11 (0x00007f88331a8000)
libssh2.so.1 => /lib64/libssh2.so.1 (0x00007f8832f7e000)
libssl3.so => /lib64/libssl3.so (0x00007f8832d3a000)
libsmime3.so => /lib64/libsmime3.so (0x00007f8832b13000)
libnss3.so => /lib64/libnss3.so (0x00007f88327ed000)
libnssutil3.so => /lib64/libnssutil3.so (0x00007f88325c0000)
libplds4.so => /lib64/libplds4.so (0x00007f88323bc000)
libplc4.so => /lib64/libplc4.so (0x00007f88321b7000)
libnspr4.so => /lib64/libnspr4.so (0x00007f8831f78000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f8831d5c000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f8831b58000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007f883190b000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007f8831626000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007f88313f4000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007f88311ef000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00007f8830fe0000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007f8830d8d000)
libz.so.1 => /lib64/libz.so.1 (0x00007f8830b76000)
libc.so.6 => /lib64/libc.so.6 (0x00007f88307b4000)
libssl.so.10 => /lib64/libssl.so.10 (0x00007f8830547000)
libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007f883015e000)
librt.so.1 => /lib64/librt.so.1 (0x00007f882ff56000)
/lib64/ld-linux-x86-64.so.2 (0x0000559d442ae000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007f882fd46000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007f882fb42000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f882f928000)
libsasl2.so.3 => /lib64/libsasl2.so.3 (0x00007f882f70a000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f882f4e5000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f882f2ad000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f882f04c000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f882ee27000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007f882ec23000)
Library dependencies. Updated libcurl-7.50.0-2.0.cf.rhel7.x86_64:
# ldd /lib64/libcurl.so.4
linux-vdso.so.1 => (0x00007ffc2e384000)
libnghttp2.so.14 => /lib64/libnghttp2.so.14 (0x00007fddeea36000)
libidn.so.11 => /lib64/libidn.so.11 (0x00007fddee803000)
libssh2.so.1 => /lib64/libssh2.so.1 (0x00007fddee5d5000)
libpsl.so.0 => /lib64/libpsl.so.0 (0x00007fddee35d000)
libssl3.so => /lib64/libssl3.so (0x00007fddee11a000)
libsmime3.so => /lib64/libsmime3.so (0x00007fddedef2000)
libnss3.so => /lib64/libnss3.so (0x00007fddedbcc000)
libnssutil3.so => /lib64/libnssutil3.so (0x00007fdded9a0000)
libplds4.so => /lib64/libplds4.so (0x00007fdded79b000)
libplc4.so => /lib64/libplc4.so (0x00007fdded596000)
libnspr4.so => /lib64/libnspr4.so (0x00007fdded358000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdded13b000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fddecf37000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007fddecceb000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007fddeca05000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007fddec7d3000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007fddec5cf000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00007fddec3bf000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007fddec16c000)
libz.so.1 => /lib64/libz.so.1 (0x00007fddebf56000)
libc.so.6 => /lib64/libc.so.6 (0x00007fddebb93000)
libssl.so.10 => /lib64/libssl.so.10 (0x00007fddeb926000)
libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007fddeb53e000)
libicuuc.so.50 => /lib64/libicuuc.so.50 (0x00007fddeb1c4000)
libicudata.so.50 => /lib64/libicudata.so.50 (0x00007fdde9bf0000)
librt.so.1 => /lib64/librt.so.1 (0x00007fdde99e7000)
/lib64/ld-linux-x86-64.so.2 (0x0000558fea944000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007fdde97d8000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007fdde95d4000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007fdde93b9000)
libsasl2.so.3 => /lib64/libsasl2.so.3 (0x00007fdde919c000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fdde8e93000)
libm.so.6 => /lib64/libm.so.6 (0x00007fdde8b91000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fdde897b000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fdde8755000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fdde851e000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007fdde82bc000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fdde8097000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007fdde7e94000)
Thanks for trying that, @ppanyukov!
@ppanyukov, @crozone, do either of you have an SSE source I can use to reproduce this myself? Either a public server somewhere, or a local server I can run, would work.
@ericeil I can build one for you easily with all the VSTS agent things we are using in Azure if you give me your public ssh key. Email ppanyukov at googlemail.com, we can discuss offline.
The tricky part with our setup is you would need a visualstudio.com thing to do builds, which I can also setup for you. Or you can get your own. Again, can discuss offline.
PS. Oh, and what is SSE source? Ah never mind, got it. Well the visualstudio.com would be SSE source for us. Otherwise we don't have anything ready.
@ppanyukov, thanks for the offer of help! I'm going to be away from GitHub for the next couple of weeks, but I'll follow up with you after that.
Just wanting to give this a bit of a prod to make sure it doesn't get forgotten :)
I'm going to write a basic event source server using kestrel for everyone to test against - give me a day or so and I'll have something to repro with.
I'm going to write a basic event source server using kestrel for everyone to test against
Thanks, @crozone! I'm back from vacation now, so I can take a look at this; just let me know when you have something I can try.
https://github.com/crozone/EventSourceDemo
All done, this is a basic event source server that hosts an event source at the URL /EventSource, and pushes down the current time in the data every second.
It is currently implemented by using the standard MVC pipeline to route requests to the EventSource action. The action sets the correct response headers and then enters a loop, asynchronously waiting on a SemaphoreSlim that signals when a message queue has been populated. The loop pops messages off the queue and pushes them as event-source-formatted plain text down the response stream. AFAIK, this behaves correctly, but if anyone has a better way of implementing an event source (without involving SignalR), let me know.
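The queue-plus-semaphore loop described here can be sketched roughly as follows. This is not the code from the EventSourceDemo repository: the EventSourceChannel name and its members are hypothetical, the response is modelled as a plain Stream rather than an MVC response, and each message is assumed to be a single line (multi-line data would need one "data:" prefix per line).

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of the server-side loop; names are illustrative.
public sealed class EventSourceChannel
{
    private readonly ConcurrentQueue<string> _messages = new ConcurrentQueue<string>();
    private readonly SemaphoreSlim _signal = new SemaphoreSlim(0);

    // Producer side: queue a message and wake the pump.
    public void Publish(string data)
    {
        _messages.Enqueue(data);
        _signal.Release();
    }

    // Consumer side: wait asynchronously for messages, then write each one
    // as a "data: ...\n\n" frame in the text/event-stream format.
    public async Task PumpAsync(Stream response, CancellationToken cancellation)
    {
        try
        {
            while (true)
            {
                await _signal.WaitAsync(cancellation);

                // Drain everything queued so far; a later wake-up that finds
                // the queue already empty is harmless.
                while (_messages.TryDequeue(out string data))
                {
                    byte[] frame = Encoding.UTF8.GetBytes("data: " + data + "\n\n");
                    await response.WriteAsync(frame, 0, frame.Length, cancellation);
                }

                await response.FlushAsync(cancellation);
            }
        }
        catch (OperationCanceledException)
        {
            // Client disconnected or server shutting down: end the loop quietly.
        }
    }
}
```

Waiting on the semaphore with WaitAsync means the loop does not occupy a thread while the connection is idle, which is the property the client-side ReadAsync is expected to have as well.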
The Index page also has some javascript that connects to it and displays the messages as part of the DOM - this works on Firefox and Chrome, but not on Edge just yet (since Edge doesn't support event sources).
Thank you again, @crozone. With this event source, I'm able to reproduce the problem on my machine! I'll investigate the CPU usage now....
We seem to get stuck in a loop calling curl_multi_wait, which apparently returns immediately. My current hunch is that this is due to the problem fixed in libcurl with this commit. The "extra" fd we pass in has data available for reading, but we don't realize this because the revents field for that fd never gets set correctly. Because we never actually read the data from the fd, it always polls as readable.
It looks like that libcurl issue was fixed in libcurl 7.32.0. Would using that version be a viable option for everyone here?
Not an option here - as it would vary the installations from the upstream vendor.
I have however started a bugzilla request with RedHat in an attempt to have the fix backported into the official RHEL packages: https://bugzilla.redhat.com/show_bug.cgi?id=1367614
If someone closer to the debugging of this issue than me is able to add more technical details to assist the RH team, this may be helpful to them.
We should probably proceed by grabbing the libcurl/7.29.0 source, making a patch for it (add the code from https://github.com/curl/curl/commit/6d30f8ebed34e7276c2a59ee20d466bff17fee56), build and test, and then submit that to the bugzilla report so it can be patched.
Ok - this is turning out to be more complex than I thought.... The EL7 packages already have a heap of patches that touch lib/multi.c. One of those patches is interfering with my application of the ported patch to these sources.
To make things more complex, because of previous patches, I can't even just rip the entire routine out and put in the modified one from 7.32.
As I'm not a native C coder, I'm only guessing at what is going on in trying to manually port it back - so I'm a little out of my depth in knowing if what I'm attempting is correct.
For reference of the issue history, the current tree used to create the RHEL package is at: https://git.centos.org/tree/rpms!curl/5522008c68b4e4b077c312f163d6f925e752437c
You'll see the many patches in the SOURCES directory. It might be better off waiting for someone much more familiar with libcurl to take a peek at this.
In related news, seems this is already known in Bugzilla by RH in a different report: https://bugzilla.redhat.com/show_bug.cgi?id=1347904
In a nutshell, scheduled to be fixed in RHEL 7.4.
Looking at the listed commits in the other bug report, it may actually come close to what I've done in the patch in my previous comment.
As such, people probably have 2 options now - test with my patch above (which @crozone and I will try), or wait for the RHEL 7.4 release to drop.
:money_with_wings: sweet.
Ok - test results are a failure.... Seems that it causes a segfault in the dotnet runtime now in the HTTP library.
Unless someone wants to go through and cherry-pick the patches from RH BZ #1347904, it may well be a case of waiting until RHEL 7.4 drops.
(I've deleted my comment with patch so nobody else wastes their time down this path)
Another option would be to add our own poll call here: https://github.com/dotnet/corefx/blob/7e2bd07936179c192e682d979b2938b4a7e32030/src/Native/Unix/System.Net.Http.Native/pal_multi.cpp#L72
This would paper over the old libcurl bug at the expense of an additional poll call in some circumstances.
Another option would be to add our own poll call
Yes, I was thinking the same thing. I'll put together a PR.
Ok, I cherry picked the commits against the RHEL version of curl 7.29. These built successfully and seem to fix the problem as documented in this thread.
I uploaded the new packages to: http://au1.mirror.crc.id.au/repo/el7-testing/x86_64/
I called these: curl-7.29.0-25.1.el7 libcurl-7.29.0-25.1.el7
From what I can gather, the fixed redhat version will be curl-7.29.0-32.el7. This means when this package hits, the ones I built will be replaced by the official redhat version.
From initial testing, seems we no longer get 100% CPU usage, however handing over to @crozone for functionality testing...
@stephentoub & @ericeil - my thoughts at the moment would be to leave this as is. RedHat will have a fixed package in distribution at some point in the timeline. I've done a shortcut by patching the existing curl packages and happy to have these packages available for anyone to test.
I think adding extra complexity to the dotnet core stuff may be just extra cruft. Happy to hear thoughts from others - but if this is fixed by my packages AND RedHat will have an update that fixes the problem in the near future, then I feel this is the better path forward.
Also, would be good if others involved can test & advise if it fixes the problem in their use case.
Running 1.0.0-preview2-003121 on Scientific Linux release 7.2 (Nitrogen), which dotnet --info identifies as (OS Name: rhel, OS Version: 7.2, OS Platform: Linux, RID: rhel.7.2-x64).
In order to read from an SSE EventSource (which is basically just an HTTP GET that doesn't close immediately), my code currently creates an HttpClient, creates an HttpRequestMessage and uses it in a SendAsync() on the HttpClient to get a HttpResponseMessage, calls HttpResponseMessage.Content.ReadAsStreamAsync(), and calls ReadAsync() on the Stream, which is then fed into a decoder.
The issue is that on RHEL 7.2, await ReadAsync() blocks (as expected), but with 100% CPU used by the dotnet process. This also occurs if the ReadAsync() is replaced by the standard blocking Read(). To my untrained eye, it appears to be spin waiting or spin locking, or something in that vein.
I have tested and confirmed that the issue does not manifest itself on Windows 10 (build 14372) or Ubuntu 14.04.
I have attached a small example project that reproduces the issue. It is unfortunately slightly cumbersome to run, since an event source must be hosted for it to read, and I can't find an existing example event source server to point it to.
RHELBugTest.zip
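The request pipeline described in this report can be sketched as below. This is not the attached RHELBugTest code: the SseClient and CannedResponseHandler names are hypothetical, and the use of HttpCompletionOption.ResponseHeadersRead is an assumption about how the response was kept from being buffered. The in-memory handler exists only so the sketch can be tried without a server.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical client sketch; names are illustrative.
public static class SseClient
{
    public static async Task ReadEventsAsync(
        HttpClient client, string url, TextWriter output, CancellationToken cancellation)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Get, url))
        {
            request.Headers.Accept.ParseAdd("text/event-stream");

            // ResponseHeadersRead (an assumption here) returns as soon as the headers
            // arrive, so the body can be consumed as a stream that never completes.
            using (HttpResponseMessage response = await client.SendAsync(
                request, HttpCompletionOption.ResponseHeadersRead, cancellation))
            {
                response.EnsureSuccessStatusCode();
                using (Stream stream = await response.Content.ReadAsStreamAsync())
                {
                    byte[] buffer = new byte[2048];
                    Decoder decoder = Encoding.UTF8.GetDecoder();
                    int byteCount;
                    // A read of 0 bytes means the server closed the connection.
                    while ((byteCount = await stream.ReadAsync(buffer, 0, buffer.Length, cancellation)) > 0)
                    {
                        char[] chars = new char[decoder.GetCharCount(buffer, 0, byteCount)];
                        int charCount = decoder.GetChars(buffer, 0, byteCount, chars, 0);
                        output.Write(chars, 0, charCount);
                    }
                }
            }
        }
    }
}

// Hypothetical in-memory handler for trying the sketch offline.
public sealed class CannedResponseHandler : HttpMessageHandler
{
    private readonly byte[] _body;

    public CannedResponseHandler(string body)
    {
        _body = Encoding.UTF8.GetBytes(body);
    }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StreamContent(new MemoryStream(_body))
        };
        return Task.FromResult(response);
    }
}
```

Against a real event source, the HttpClient would be constructed without the canned handler and the await on ReadAsync is where the process is expected to sit idle between events.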