PILLUTLAAVINASH / google-enterprise-connector-manager

Automatically exported from code.google.com/p/google-enterprise-connector-manager

Base64 and URL-encoding are inefficient #27

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

See below.

What is the expected output? What do you see instead?

I profiled the Livelink connector for performance. It spent about 20% of
its time in the Livelink connector itself and most of the rest, more than
60% of the total, Base64- and URL-encoding the feed URLs. There are two
grossly inefficient things about the current implementation:

1. The Base64-encoded content stream is URL-encoded. It probably doesn't
need to be, given where the unencoded "=" and "/" characters could appear
in the feed URLs, and it certainly wouldn't need to be if an alternate
Base64-encoding were used that had extra characters that did not need to be
URL-encoded.

2. Both the UrlEncodedFilterInputStream and the Base64FilterInputStream
implement the read(byte[], int, int) method with calls to read(), and
GsaFeedConnection calls read() directly. In my sample run, this meant that
369 calls to sendData turned into 15 million calls to read() on the two
streams combined. My test run spent 6% of its time just checking
"encodedBufEndPos == encodedBufPos" twice on every byte in the stream.
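
To illustrate point 2, here is a minimal sketch of how the number of calls differs between draining a stream one byte at a time and draining it through read(byte[], int, int). It does not use the connector manager's classes: CountingStream, drainByteAtATime, and drainBuffered are hypothetical names invented for this example, and the 8192-byte buffer size is an arbitrary assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BulkReadSketch {

  /** Hypothetical filter that counts how often each read variant is called. */
  static class CountingStream extends FilterInputStream {
    long singleByteReads = 0;
    long bulkReads = 0;

    CountingStream(InputStream in) {
      super(in);
    }

    @Override
    public int read() throws IOException {
      singleByteReads++;
      return super.read();
    }

    // Delegates the whole block to the wrapped stream in one call,
    // instead of looping over read() for every byte.
    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      bulkReads++;
      return super.read(b, off, len);
    }
  }

  /** Drains the stream with one read() call per byte; returns bytes read. */
  static long drainByteAtATime(InputStream in) throws IOException {
    long total = 0;
    while (in.read() != -1) {
      total++;
    }
    return total;
  }

  /** Drains the stream through a reusable buffer; returns bytes read. */
  static long drainBuffered(InputStream in, int bufSize) throws IOException {
    byte[] buf = new byte[bufSize];
    long total = 0;
    int n;
    while ((n = in.read(buf, 0, buf.length)) != -1) {
      total += n;
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[1 << 20]; // 1 MiB of sample content

    CountingStream slow = new CountingStream(new ByteArrayInputStream(data));
    drainByteAtATime(slow);

    CountingStream fast = new CountingStream(new ByteArrayInputStream(data));
    drainBuffered(fast, 8192);

    System.out.println("byte-at-a-time read() calls: " + slow.singleByteReads);
    System.out.println("buffered read(byte[],int,int) calls: " + fast.bulkReads);
  }
}
```

A filter that actually transforms the data (as the Base64 and URL-encoding filters do) would need to encode a block at a time inside read(byte[], int, int), but the call-count difference is the same in kind.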

Please use labels and text to provide additional information.

Original issue reported on code.google.com by donald.z...@gmail.com on 24 Jan 2007 at 3:51

GoogleCodeExporter commented 8 years ago
Brian, please add to backlog.

Original comment by donald.z...@gmail.com on 24 Jan 2007 at 3:52

GoogleCodeExporter commented 8 years ago
Google Bug #244002

Original comment by vjo...@gmail.com on 9 Feb 2007 at 5:19

GoogleCodeExporter commented 8 years ago
Fixed item 1 of the original report in r536.

There are two ways to feed content to the GSA via the GSA Feeds Protocol:
application/x-www-form-urlencoded and multipart/form-data. The GSA currently
uses application/x-www-form-urlencoded. Therefore, it needs to URL-encode all
the content it uploads.

The Google Search Appliance Feeds Protocol Developer's Guide
(http://code.google.com/enterprise/documentation/feedsguide.html#pushing_feeds)
says:

> You should post the feed using enctype="multipart/form-data". Although the
> Google Search Appliance supports uploads using
> enctype="application/x-www-form-urlencoded", this encoding type is not
> recommended for large amounts of data.

As of r536, the connector-manager uses multipart/form-data.
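
As a rough illustration of what URL-encoding costs for Base64 content, and of the alternate Base64 alphabet suggested in the original report, the sketch below compares standard Base64, its URL-encoded form, and the URL-safe variant. It is not the connector manager's code: the class and method names are hypothetical, and it assumes java.util.Base64 (Java 8 or later), which was not available when this issue was filed.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Base64;

public class Base64UrlOverheadSketch {

  /** Standard Base64: the alphabet includes '+', '/', and '=' padding. */
  static String standard(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  /** URL-safe Base64: uses '-' and '_' instead, with padding omitted. */
  static String urlSafe(byte[] raw) {
    return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
  }

  /** Form URL-encoding, as required by application/x-www-form-urlencoded. */
  static String urlEncode(String s) throws UnsupportedEncodingException {
    return URLEncoder.encode(s, "UTF-8");
  }

  public static void main(String[] args) throws UnsupportedEncodingException {
    // Bytes chosen so that the standard encoding contains '+' and '/'.
    byte[] raw = {(byte) 0xfb, (byte) 0xff, (byte) 0xfe};

    String std = standard(raw);
    String safe = urlSafe(raw);

    System.out.println("standard:             " + std);
    System.out.println("standard, URL-encoded: " + urlEncode(std));
    System.out.println("URL-safe:             " + safe);
    // The URL-safe form contains nothing that URLEncoder needs to escape.
    System.out.println("URL-safe unchanged by URL-encoding: "
        + safe.equals(urlEncode(safe)));
  }
}
```

With multipart/form-data the Base64 text is sent as-is, so neither the escaping pass nor the three-characters-per-escaped-character growth applies.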

Original comment by tim.g...@gmail.com on 15 Jun 2007 at 5:03

GoogleCodeExporter commented 8 years ago

Original comment by mgron...@gmail.com on 3 Oct 2007 at 11:00