jirentabu / crashrpt

Automatically exported from code.google.com/p/crashrpt

Multi-part crash reports delivery with libcurl #13

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
1) In httpsend.cpp in the _Send(...) function, there is this code:

  if(atoi(szResponce)!=200)
  {
    an->SetProgress(_T("Failed"), 100, false);
    goto exit;
  }

I believe this is in error. It is not reliable to expect the first few
characters of a web page to contain the status code. In fact, unless I
comment out the above lines of code the HTTP upload never works. I think
this code can safely be removed. Comments?

2) The current HTTP upload mechanism only supports base64 encoding. This is
problematic when sending large files (for example, when dumping the entire
process memory space). Using base64 with large files causes a couple of
problems. First, the file has to be base64 encoded and certain characters
then have to be replaced, which is extremely slow on large files. Consider
this code:

  sPOSTRequest = base64_encode(uchFileData, uFileSize).c_str();
  sPOSTRequest = szPrefix + sPOSTRequest + szSuffix;  
  sPOSTRequest.Replace(_T("+"), _T("%2B"));
  sPOSTRequest.Replace(_T("/"), _T("%2F"));  

Those Replace(...) function calls are computationally expensive with large
strings. Second, server-side performance suffers. Presently, when a large
crash dump is posted, the server has to copy the entire base64 upload into
memory as one contiguous block. If several users post large crash dumps at
the same time, this could crash the server due to memory exhaustion.
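
As a rough illustration of the first cost, the escaping can be folded into the encoder so that the large string is never rescanned. The sketch below is not CrashRpt's code: `Base64EncodeForPost` is a hypothetical helper written against the standard library only, and it escapes just `+` and `/`, mirroring the two Replace(...) calls above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: base64-encodes `data` and percent-escapes '+' and '/'
// in the same pass, so the output is written once and never rescanned.
std::string Base64EncodeForPost(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    std::string out;
    // Lower bound: 4 characters per 3 input bytes (escapes add a little more).
    out.reserve(((size + 2) / 3) * 4);

    // Emits one base64 digit, escaping the two characters the original
    // code replaced afterwards.
    auto emit = [&out](unsigned index) {
        const char c = kAlphabet[index & 0x3F];
        if (c == '+')      out += "%2B";
        else if (c == '/') out += "%2F";
        else               out += c;
    };

    std::size_t i = 0;
    for (; i + 3 <= size; i += 3) {
        const unsigned n = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2];
        emit(n >> 18); emit(n >> 12); emit(n >> 6); emit(n);
    }

    // Handle the 1 or 2 trailing bytes with '=' padding.
    const std::size_t rest = size - i;
    if (rest == 1) {
        const unsigned n = data[i] << 16;
        emit(n >> 18); emit(n >> 12);
        out += "==";
    } else if (rest == 2) {
        const unsigned n = (data[i] << 16) | (data[i + 1] << 8);
        emit(n >> 18); emit(n >> 12); emit(n >> 6);
        out += '=';
    }
    return out;
}
```

This removes the repeated passes over the string, but the encoded payload is still at least a third larger than the file and still has to be held in memory, which is why it does not address the second problem.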

Multi-part uploads would solve most if not all of these problems.

I would suggest that you investigate using libcurl
http://curl.haxx.se/libcurl/ to post the crash report to the remote server.
Writing the code to perform a multi-part post using libcurl is very
straightforward and fast.

Original issue reported on code.google.com by crcod...@gmail.com on 23 Nov 2009 at 11:03

GoogleCodeExporter commented 9 years ago
In addition to libcurl (an example of using libcurl to perform non-blocking
multipart upload can be found here:
http://curl.haxx.se/libcurl/c/multi-post.html ) I've attached an example to
illustrate how multipart encoding works, in case it needs to be implemented
directly.

In the attached zip, there are two files:

  upload_test.html - A simple HTML form containing a file upload field.  There are
also text fields for a description and comments.

  multipart_form_data.txt - This is the full HTTP request that is sent from the
browser to the web server when you click the submit button in the above form.
In this case, the file that I selected to upload was the same HTML file
(upload_test.html).

Note the following details pertaining to multipart form data:

- The content type is multipart/form-data, rather than
application/x-www-form-urlencoded.

- You must select a "boundary" string.  This string must not occur anywhere in
the file you are sending.  The boundary is specified as part of the
Content-Type header.

- The boundary used in this example is
"----WebKitFormBoundaryLbY6AhAtDH+gKEMT".  It is not strictly necessary for
this to start with hyphens; any string will do.  Note, however, that wherever
it is used, there are two additional hyphens at the beginning.  That is because
the section start indicator consists of two hyphens followed by the boundary
string.  For example, if the boundary string is "MyBoundary", then you start a
section with "--MyBoundary".  Note also that the end marker would be
"--MyBoundary--" in that case (i.e. add two hyphens to both the beginning and
end of the boundary string).

- The section markers need to start at the beginning of a line.  So even if the
file you're sending is binary, you need to add an end-of-line sequence at the
end, before the section marker.  My example file ends with a newline, but
you'll note that an extra end-of-line sequence was added, resulting in a blank
line before the section marker.

- End-of-line sequences should be CR LF in compliance with RFC 2616.  (But this
doesn't mean changing bare LF characters to CR LF sequences in the uploaded
file's contents.)

- You don't need to escape the file's contents in any way.  The file is assumed
to start after the first blank line in its section, and it ends when the
boundary string (preceded by two hyphens) is encountered.
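
The rules above can be sketched as a small body builder. This is a minimal illustration under assumptions, not the attached example or CrashRpt's implementation: `AppendPart`, `AppendEndMarker`, and `ContentTypeHeader` are hypothetical names, and the whole body is built in memory for brevity, which a real uploader of large dumps would want to avoid.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: appends one section of a multipart/form-data body.
// `filename` and `content_type` may be empty for plain text fields.
void AppendPart(std::string& body, const std::string& boundary,
                const std::string& name, const std::string& filename,
                const std::string& content_type, const std::string& content)
{
    // The boundary must not occur in the data being sent.
    assert(content.find("--" + boundary) == std::string::npos);

    body += "--" + boundary + "\r\n";   // section start: two hyphens + boundary
    body += "Content-Disposition: form-data; name=\"" + name + "\"";
    if (!filename.empty())
        body += "; filename=\"" + filename + "\"";
    body += "\r\n";
    if (!content_type.empty())
        body += "Content-Type: " + content_type + "\r\n";
    body += "\r\n";                     // blank line: content starts after it
    body += content;                    // raw bytes, no escaping, LF left as is
    body += "\r\n";                     // next marker begins on a new line
}

// Appends the end marker: two hyphens on both sides of the boundary.
void AppendEndMarker(std::string& body, const std::string& boundary)
{
    body += "--" + boundary + "--\r\n";
}

// The request header that announces the boundary (without the extra hyphens).
std::string ContentTypeHeader(const std::string& boundary)
{
    return "Content-Type: multipart/form-data; boundary=" + boundary;
}
```

A text field would be added with empty `filename` and `content_type`, the dump file with both set, and `AppendEndMarker` called once after the last part.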

Original comment by crcod...@gmail.com on 24 Nov 2009 at 3:32


GoogleCodeExporter commented 9 years ago
1) The fragment from CrashRpt documentation:

"The script should return status of request completion as server response
content. In HTTP/1.0 and since, the first line of the HTTP response is called
the status line and includes a numeric status code (such as "404") and a
textual reason phrase (such as "Not Found").

If the script succeeds in saving the error report, it should return the "200
Success" as server response content. If the script encounters an error, it
should return a 4xx error code, for example "450 Invalid parameter". Note that
some error codes are reserved by HTTP specification. If the script uses custom
error codes, they shouldn't intersect with standard predefined HTTP codes.

Note: When creating your own script, be careful with the script's return code.
If your script succeeds in saving error report it should return '200 Success'.
If crash sending process encounters another error code, it attempts sending the
error report using another way. In such situation you may receive the same
error report several times through different transport."
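
To make the convention in the quoted documentation concrete, here is a hedged sketch of how a client could read the leading code from the response content. `ParseLeadingStatus` and `ReportAccepted` are hypothetical names, not functions from httpsend.cpp, and the sketch assumes the content is available as a narrow string.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the three-digit code that starts the response
// content, or -1 when the content does not begin with such a number
// (for example, when the server returned an HTML page instead).
int ParseLeadingStatus(const std::string& content)
{
    const char* begin = content.c_str();
    char* end = nullptr;
    const long code = std::strtol(begin, &end, 10);  // skips leading whitespace
    if (end == begin || code < 100 || code > 999)
        return -1;
    return static_cast<int>(code);
}

// The report counts as delivered only when the script answered "200 ...".
bool ReportAccepted(const std::string& content)
{
    return ParseLeadingStatus(content) == 200;
}
```

Returning -1 separately from a real code lets the caller tell a script that reported an error such as "450 Invalid parameter" apart from a script that does not follow the convention at all.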

2) I will investigate if we can use libcurl. This may be useful.

Original comment by zexspect...@gmail.com on 24 Nov 2009 at 2:44

GoogleCodeExporter commented 9 years ago
Thank you for the information about the server response. 

Please do investigate libcurl, it is an extremely useful library.

Original comment by crcod...@gmail.com on 25 Nov 2009 at 1:24

GoogleCodeExporter commented 9 years ago
I've familiarized myself with libcurl. As I understand it, libcurl can post
HTTP requests by splitting them into many small parts. This is useful when
sending huge files.

I think this feature is very useful, especially if support for huge dumps is
to be implemented. Our current delivery transports (plain HTTP and e-mail) are
not designed to transfer large files.

Regarding the implementation: since the libcurl distribution is rather large
(3.5 MB), we can't include its code in the CrashRpt download. We will provide
a link to the libcurl download page, so users will be able to download it
themselves.

I think libcurl usage should be optional. We shouldn't require the user to
have libcurl installed to compile CrashRpt. We should introduce a compilation
switch macro, for example _USE_LIBCURL, to allow the user to enable libcurl
support.
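
A compile-time switch of that kind could look roughly like the sketch below. Only the macro name _USE_LIBCURL comes from the comment above; `HttpBackendName` and the returned strings are hypothetical and exist just to show which branch was compiled.

```cpp
#include <string>

// _USE_LIBCURL is the switch proposed above. It would be defined in the
// project settings by users who have libcurl installed.
#ifdef _USE_LIBCURL
#include <curl/curl.h>   // pulled in only when the switch is on
#endif

// Hypothetical helper: reports which upload backend this build contains.
std::string HttpBackendName()
{
#ifdef _USE_LIBCURL
    return "libcurl";
#else
    return "builtin";    // the existing HTTP transport, no extra dependency
#endif
}
```

With the switch left undefined, the libcurl header is never included, so CrashRpt would still compile on machines without libcurl.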

Original comment by zexspect...@gmail.com on 27 Nov 2009 at 1:01

GoogleCodeExporter commented 9 years ago
After some additional research, I think that multi-part uploads can be
implemented using the usual HttpSendRequestEx() and InternetWriteFile()
functions provided by WinInet. Multipart file uploads are part of the HTTP
specification, and the regular WinInet functions work well enough to handle
this task.

It seems that libcurl is not required at all. libcurl seems to be a
multi-platform replacement for some subset of WinInet functionality.

Original comment by zexspect...@gmail.com on 14 Jan 2010 at 8:57

GoogleCodeExporter commented 9 years ago
Implemented in v.1.2.2

Original comment by zexspect...@gmail.com on 23 Mar 2010 at 5:34