watson-intu / self

Intu is a Cognitive Embodiment Middleware for AI on the edge.

Self is crashing every 5/10 minutes #45

Closed iportilla closed 6 years ago

iportilla commented 6 years ago

Starting last week, several Peppers running self have been crashing every 5-10 minutes. There is nothing in the SelfInstance.log files, but there are several core files in self/latest:

~/self/latest $ ls -lta core*
-rw------- 1 nao nao 327602176 Mar 22 09:56 core.3659
-rw------- 1 nao nao 338771968 Mar 22 09:55 core.3502
-rw------- 1 nao nao 316096512 Mar 22 09:50 core.3424

gradybooch commented 6 years ago

The robot uprising has begun...


takaomoriyama commented 6 years ago

I checked whether the same problem could happen on my Pepper. Yesterday it did not reproduce, but today it did: Intu dumps core and restarts every few minutes.

From the console log (not the log file, which is missing the very last portion of the output when the process aborts), I found that an assertion fails at line 785 of lib/cpp-sdk/src/utils/WebClient.cpp.

self_instance: /Users/moriyama/dev/watson-intu/self/lib/cpp-sdk/src/utils/WebClient.cpp:785: void WebClientT<socket_type>::HTTP_OnChunkLength(const boost::system::error_code&, size_t) [with socket_type = boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp> >, size_t = unsigned int]: Assertion `m_ContentLen > 0' failed.
Aborted (core dumped)

I also found that chunk_length contains "00000000\r" instead of "0\r". The else-if branch below does not match that string, so execution falls through to the final else branch, where strtoul returns 0 and the assertion fails.

lib/cpp-sdk/src/utils/WebClient.cpp:

void HTTP_OnChunkLength( const boost::system::error_code & error, size_t bytes_transferred )
{
    if (! error )
    {
        std::istream input(&m_RecvBuffer);
        std::string chunk_length;
        std::getline(input,chunk_length);

        //Log::Status( "WebClient", "Read Chunk Len: %s", chunk_length.c_str() );
        if ( chunk_length == "\r" )
        {
            HTTP_ReadChunkLength();
        }
        else if ( chunk_length == "0\r" )
        {
            // end of chunked content
            HTTP_ReadChunkFooter();
        }
        else
        {
            m_ContentLen = strtoul( chunk_length.c_str(), NULL, 16 );
            assert( m_ContentLen > 0 );

            HTTP_ReadContent( error, 0 );
        }

A possible fix is to first convert chunk_length (a string) to an integer, then compare the result with zero.
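A minimal sketch of that idea, assuming a hypothetical standalone helper named ParseChunkLength (the helper, the ChunkLine enum, and the ChunkHeader struct are illustrative names, not part of the Intu SDK, and the actual change in the SDK may look different). It parses the hex digits first and classifies the line by its numeric value, so "0\r" and "00000000\r" are treated the same way:

```cpp
#include <cctype>
#include <climits>
#include <string>

// Hypothetical classification of a chunk-length line (not SDK code).
enum class ChunkLine { Blank, Last, Data, Invalid };

struct ChunkHeader
{
    ChunkLine kind;
    unsigned long length;   // meaningful only when kind == ChunkLine::Data
};

// Classify one line as returned by std::getline() on the receive buffer.
ChunkHeader ParseChunkLength( std::string line )
{
    // std::getline() stops at '\n', so the '\r' of the CRLF is still attached.
    if ( !line.empty() && line.back() == '\r' )
        line.pop_back();
    if ( line.empty() )
        return { ChunkLine::Blank, 0 };

    // Accumulate the leading hex digits, rejecting values that would overflow.
    unsigned long value = 0;
    std::string::size_type i = 0;
    for ( ; i < line.size(); ++i )
    {
        const unsigned char c = static_cast<unsigned char>( line[i] );
        if ( !std::isxdigit( c ) )
            break;
        const unsigned long digit =
            std::isdigit( c ) ? c - '0' : std::tolower( c ) - 'a' + 10;
        if ( value > ( ULONG_MAX - digit ) / 16 )
            return { ChunkLine::Invalid, 0 };
        value = value * 16 + digit;
    }

    // At least one hex digit is required; after the digits only a chunk
    // extension (introduced by ';') is accepted in this sketch.
    if ( i == 0 || ( i < line.size() && line[i] != ';' ) )
        return { ChunkLine::Invalid, 0 };

    // Compare numerically, so any number of leading zeros means "last chunk".
    if ( value == 0 )
        return { ChunkLine::Last, 0 };
    return { ChunkLine::Data, value };
}
```

In HTTP_OnChunkLength, the Blank, Last, and Data results would correspond to the existing HTTP_ReadChunkLength(), HTTP_ReadChunkFooter(), and HTTP_ReadContent() branches; an Invalid result could be handled as a protocol error instead of an assertion.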

PS. chunk_length is the chunk size in the HTTP response body when the server selects the "chunked" Transfer-Encoding (the server is Watson STT in my case). The HTTP spec defines it as a hexadecimal number.

PS2. More interestingly, the server switches between chunked and non-chunked responses even within a single Intu session. This behavior of Watson STT probably changed recently.

takaomoriyama commented 6 years ago

Console log (with some debug messages added):

[04/03/18 13:14:23.361][STAT][DiscoveryAgent] OnDiscovered instance fa4178b1-bdc5-ffb7-0cf9-abf9c16b7965, IP: 192.168.9.11, 1 instances
[04/03/18 13:14:23.390][STAT][WebClient] HTTP_ReadHeaders - Transfer-Encoding=not-chunked
[04/03/18 13:14:23.404][STAT][WebClient] HTTP_ReadHeaders - Transfer-Encoding=not-chunked
[04/03/18 13:14:23.517][STAT][SelfInstance] Local config saved to ./config.json.
[04/03/18 13:14:23.566][STAT][WebClient] HTTP_ReadHeaders - Transfer-Encoding=not-chunked
[04/03/18 13:14:23.624][STAT][WebClient] HTTP_ReadHeaders - Transfer-Encoding=chunked
[04/03/18 13:14:23.624][STAT][WebClient] Read Chunk Len: 0000005A
[04/03/18 13:14:23.625][STAT][WebClient] Read Chunk Len: 
[04/03/18 13:14:23.625][STAT][WebClient] Read Chunk Len: 00000005
[04/03/18 13:14:23.625][STAT][WebClient] Read Chunk Len: 
[04/03/18 13:14:23.625][STAT][WebClient] Read Chunk Len: 0000005A
[04/03/18 13:14:23.625][STAT][WebClient] Read Chunk Len: 
[04/03/18 13:14:23.626][STAT][WebClient] Read Chunk Len: 0000186E
[04/03/18 13:14:23.626][STAT][WebClient] Read Chunk Len: 
[04/03/18 13:14:23.627][STAT][WebClient] Read Chunk Len: 00000000
self_instance: /Users/moriyama/dev/watson-intu/self/lib/cpp-sdk/src/utils/WebClient.cpp:787: void WebClientT<socket_type>::HTTP_OnChunkLength(const boost::system::error_code&, size_t) [with socket_type = boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp> >, size_t = unsigned int]: Assertion `m_ContentLen > 0' failed.
Aborted

Note that some HTTP transfers are not chunked, while others are.

mgfos207 commented 6 years ago

@takaomoriyama this looks to be related to Watson Services recently dropping support for TLS 1.0 and 1.1 (https://console.bluemix.net/docs/get-support/appsectls.html#tlssupportwithdraw). HTTP clients now need to support the Server Name Indication (SNI) TLS extension. I'm wondering if there is a way to determine which version of TLS the HTTP client is using, and whether we can safely upgrade to TLS 1.2.

takaomoriyama commented 6 years ago

This issue has been fixed by PR #46 and is ready to be closed.