Azure / azure-storage-cpp

Microsoft Azure Storage Client Library for C++
http://azure.github.io/azure-storage-cpp
Apache License 2.0

Upload of page blob with stream size >4MB fails silently #404

Open ShippyMSFT opened 3 years ago

ShippyMSFT commented 3 years ago

Recently we noticed that uploads of page blobs from a stream larger than 4MB were completing unrealistically fast. On further investigation we found that 4MB is the maximum, but the SDK does not throw any error in this case, so it appears that everything completed successfully.

In istream_descriptor::create() it correctly throws std::invalid_argument in this case:

if (length != std::numeric_limits<utility::size64_t>::max() && length > max_length)
{
    throw std::invalid_argument(protocol::error_stream_length);
}

However, the implementation in basic_cloud_page_blob_ostreambuf::upload_buffer() never records that exception:

            try
            {
                this_pointer->m_blob->upload_pages_async_impl(buffer->stream(), offset, buffer->content_checksum(), this_pointer->m_condition, this_pointer->m_options, this_pointer->m_context, this_pointer->m_cancellation_token, this_pointer->m_use_request_level_timeout, this_pointer->m_timer_handler).then([this_pointer] (pplx::task<void> upload_task)
                {
                    std::lock_guard<async_semaphore> guard(this_pointer->m_semaphore, std::adopt_lock);
                    try
                    {
                        upload_task.wait();
                    }
                    catch (const std::exception&)
                    {
                        this_pointer->m_currentException = std::current_exception();
                    }
                });
            }
            catch (...)
            {
                this_pointer->m_semaphore.unlock();  // <-- We end up here
            }

Since m_currentException is not set here, the error is never processed at the end in basic_cloud_blob_ostreambuf::_sync(), which returns as if the operation had succeeded. The problem can be worked around once it is understood, but if it goes unnoticed it is a data integrity issue.
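For illustration only (this is not the actual fix in the SDK), one possible direction would be to record the exception in the outer catch as well, mirroring what the inner continuation already does, so that _sync() can rethrow it:

            catch (...)
            {
                // Sketch only: record the synchronously thrown exception (e.g. the
                // std::invalid_argument from istream_descriptor::create()) so that
                // basic_cloud_blob_ostreambuf::_sync() can rethrow it later, instead
                // of just unlocking the semaphore and reporting success.
                this_pointer->m_currentException = std::current_exception();
                this_pointer->m_semaphore.unlock();
            }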

Jinming-Hu commented 3 years ago

@ShippyMSFT Thanks for reporting the issue. I'm going to look into this.

Jinming-Hu commented 3 years ago

Hi @ShippyMSFT, I tested with the code below:

concurrency::streams::istream input_stream = concurrency::streams::container_stream<std::vector<uint8_t>>::open_istream(std::vector<uint8_t>(5 * 1024 * 1024, 'a'));
try
{
    blob1.upload_pages(input_stream, 0, checksum());
}
catch (const std::exception& e)
{
    ucout << _XPLATSTR("Error: ") << e.what() << std::endl;
}

and it threw an exception. The error message is

Error: The length of the stream exceeds the permitted length.

Which API are you using when you hit this issue? Can you share steps to reproduce it?

ShippyMSFT commented 3 years ago

We are using cloud_page_blob::upload_from_stream_async() to upload. The stream we use varies, but the issue can definitely be reproduced with an istream created with something like this: concurrency::streams::istream testStream = concurrency::streams::file_stream<uint8_t>::open_istream(filePath).get();
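For reference, here is a minimal repro sketch along those lines. The connection string, container name, blob name and file path are placeholders, and the file is assumed to be larger than 4MB with a length that is a multiple of 512 bytes (as page blobs require):

#include <was/storage_account.h>
#include <was/blob.h>

// All names below are placeholders for illustration.
azure::storage::cloud_storage_account account =
    azure::storage::cloud_storage_account::parse(_XPLATSTR("<connection string>"));
azure::storage::cloud_blob_client client = account.create_cloud_blob_client();
azure::storage::cloud_blob_container container = client.get_container_reference(_XPLATSTR("mycontainer"));
azure::storage::cloud_page_blob pageBlob = container.get_page_blob_reference(_XPLATSTR("test-page-blob"));

// Open the local file (assumed > 4MB) as an istream, as described above.
concurrency::streams::istream testStream =
    concurrency::streams::file_stream<uint8_t>::open_istream(_XPLATSTR("testfile.bin")).get();

// With a file larger than 4MB this returns almost immediately and without
// throwing, even though the data was not actually uploaded.
pageBlob.upload_from_stream_async(testStream).wait();
testStream.close().wait();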

Jinming-Hu commented 3 years ago

Hi @ShippyMSFT, I tried again with the code below, and it still works (the exception is still thrown).

        concurrency::streams::istream input_stream = concurrency::streams::container_stream<std::vector<uint8_t>>::open_istream(std::vector<uint8_t>(5 * 1024 * 1024, 'a'));
        try
        {
          blob1.upload_from_stream_async(input_stream).wait();
        }
        catch (const std::exception& e)
        {
          ucout << _XPLATSTR("Error: ") << e.what() << std::endl;
        }

ShippyMSFT commented 3 years ago

That's bizarre. I'll dig a bit and see if I can figure something out.