googlegsa / sharepoint.v3

Google Search Appliance Connector for SharePoint

"Input is not proper UTF-8" in Feed Data Source log #179

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Manually install Connector Manager 2.6.0 and SharePoint connector 2.6.8 as per the instructions.
2. Configure the connector to feed content (Authorization: By Connector) and enable the SharePoint search visibility options.
3. Crawl the SharePoint sites (we have not been able to identify which content causes this issue).

What is the expected output? What do you see instead?
I would expect all content to be crawled and fed to the GSA without encoding issues in the feed XML file.

Instead, the Feed Data Source log shows: Skipping the rest of the feed, Line number: 6871, Error: Input is not proper UTF-8, indicate encoding !
Bytes: 0xE2 0x80 0x3F 0x20
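
For anyone trying to track down the offending content: 0xE2 0x80 starts a three-byte UTF-8 sequence, and 0x3F ("?") is not a valid continuation byte, which is why the parser stops there. Below is a minimal sketch, assuming a copy of the feed XML has been saved to disk, that reports the line containing the first invalid sequence. The function name `find_invalid_utf8` and the sample bytes are made up for illustration and are not part of the connector.

```python
def find_invalid_utf8(data: bytes):
    """Return (line_number, byte_offset, next_four_bytes) for the first
    invalid UTF-8 sequence in data, or None if data decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # Count the newlines before the failure to get a 1-based line number.
        line_number = data.count(b"\n", 0, err.start) + 1
        return line_number, err.start, data[err.start:err.start + 4]
    return None


# Hypothetical two-line sample; the second line carries the bytes from the log.
sample = (
    b'<meta name="x" content="ok"/>\n'
    b'<meta name="y" content="\xe2\x80\x3f "/>\n'
)
result = find_invalid_utf8(sample)
print(result)
```

To check a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to the function; the reported line number should then correspond to the one in the Feed Data Source log.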

What version of the product are you using? On what operating system?
GSA 6.8.0.30 patch 6
Connector Manager 2.6.0
Connector 2.6.8
Windows 2008 32 bit
SharePoint 2007

Original issue reported on code.google.com by richard....@gmail.com on 4 Apr 2011 at 2:37

GoogleCodeExporter commented 9 years ago
An update to this as it still does not appear to have been looked at:

Sample metadata from a feed:
<meta name="Project Description" content="themes – “Produced 
Fluids� and “HP/HT Production�."/>

The same value in SharePoint:
themes – “Produced Fluids” and “HP/HT Production”.
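
One possible explanation, not confirmed anywhere in this report, is that the UTF-8 text is being read as Windows-1252 somewhere along the way. The closing quote ” is 0xE2 0x80 0x9D in UTF-8, and 0x9D has no mapping in Windows-1252, so it would be replaced and then written out as "?" (0x3F). The sketch below reproduces that round trip on the SharePoint value above; the two steps are an assumption about where the damage might occur, not a description of the connector's code.

```python
original = "themes – “Produced Fluids” and “HP/HT Production”."

# Step 1 (assumed): correct UTF-8 bytes are misread as Windows-1252.
# Byte 0x9D is undefined in that code page, so it becomes U+FFFD.
misread = original.encode("utf-8").decode("cp1252", errors="replace")

# Step 2 (assumed): the misread text is written back out as single bytes.
# U+FFFD cannot be encoded, so it is replaced with "?" (0x3F).
damaged = misread.encode("cp1252", errors="replace")

print(misread)
print(b"\xe2\x80\x3f\x20" in damaged)
```

If this is what happens, the first print should show the same garbling as the feed sample, and the damaged bytes contain the 0xE2 0x80 0x3F 0x20 sequence from the error message. That would point to a charset being dropped or defaulted when the metadata is read or written, rather than to bad data in SharePoint.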

Original comment by richard....@gmail.com on 11 Aug 2011 at 4:14

GoogleCodeExporter commented 9 years ago
This issue is filed as Google issue #6514005

Original comment by tdnguyen@google.com on 18 May 2012 at 12:30