amauriaces / mp-onlinevideos2

Automatically exported from code.google.com/p/mp-onlinevideos2

Chinese/Japanese/Korean characters don't display properly for dynamicCategories #35

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Import the following site

    <Site name="aTV 亞洲電視" util="GenericSite" agecheck="false" enabled="true" lang="zh">
      <Description>aTV 亞洲電視網上電視</Description>
      <Configuration>
        <item key="dynamicCategoriesRegEx"><![CDATA[<td\sheight\s=\s"15"[^\n]*><a\shref=(?<url>[^\>]+)>..(?<title>[^\<]+)</a></TD>]]></item>
        <item key="dynamicCategoryUrlFormatString"><![CDATA[http://app2.hkatv.com{0}]]></item>
        <item key="dynamicCategoryUrlDecoding"><![CDATA[False]]></item>
        <item key="dynamicSubCategoriesRegEx"><![CDATA[]]></item>
        <item key="dynamicSubCategoryUrlFormatString"><![CDATA[]]></item>
        <item key="dynamicSubCategoryUrlDecoding"><![CDATA[False]]></item>
        <item key="videoListRegEx"><![CDATA[<option[\sa-z]+value=(?<VideoUrl>[^\>]+)>(?<Title>[^\n\r]+)]]></item>
        <item key="videoListRegExFormatString"><![CDATA[http://app2.hkatv.com/v3/webtv/{0}]]></item>
        <item key="videoUrlRegEx"><![CDATA[]]></item>
        <item key="videoUrlDecoding"><![CDATA[False]]></item>
        <item key="nextPageRegExUrlDecoding"><![CDATA[False]]></item>
        <item key="prevPageRegExUrlDecoding"><![CDATA[False]]></item>
        <item key="fileUrlRegEx"><![CDATA[]]></item>
        <item key="fileUrlFormatString"><![CDATA[]]></item>
        <item key="baseUrl"><![CDATA[http://app2.hkatv.com/v3/webtv/webtv.php?channel=NewsMenu]]></item>
        <item key="forceUTF8Encoding"><![CDATA[True]]></item>
      </Configuration>
      <Categories />
    </Site>

2. Start the OnlineVideos plugin in MediaPortal and select the above site
3. The characters in the categories view are corrupted
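
To illustrate how the dynamicCategoriesRegEx and dynamicCategoryUrlFormatString entries in the site definition combine, here is a minimal sketch in Python (not the plugin's own C# code). The sample HTML line and its path are hypothetical, made up only to fit the pattern, and the named groups use Python's `(?P<name>...)` spelling instead of .NET's `(?<name>...)`.

```python
import re

# Pattern taken from dynamicCategoriesRegEx; only the named-group syntax is
# changed from .NET's (?<name>...) to Python's (?P<name>...).
CATEGORY_RE = re.compile(
    r'<td\sheight\s=\s"15"[^\n]*><a\shref=(?P<url>[^\>]+)>..(?P<title>[^\<]+)</a></TD>'
)

# Value of dynamicCategoryUrlFormatString; {0} is replaced by the "url" group.
URL_FORMAT = "http://app2.hkatv.com{0}"

# Hypothetical markup, invented for this sketch (not copied from the real site).
sample_html = (
    '<td height = "15" class="menu">'
    '<a href=/v3/webtv/list.php?cat=1>* 新聞</a></TD>'
)

# Each match yields one category: a display title and an absolute URL.
categories = [
    (m.group("title"), URL_FORMAT.format(m.group("url")))
    for m in CATEGORY_RE.finditer(sample_html)
]

for title, url in categories:
    print(title, url)
```

The title group is where the corruption shows up: if the page bytes were decoded with the wrong encoding before the regex ran, the captured title is already garbled.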

What is the expected output? What do you see instead?
The categories should be displayed in Chinese characters

What version of the product are you using? On what operating system?
Both 0.23 and the latest trunk have the issue. MediaPortal 1.1 RC6 on Windows XP.

Please provide any additional information below.
The web page is encoded in UTF-8, but the HTTP response doesn't specify a 
charset. To decode it properly, forceUTF8Encoding has to be set to true. 
However, GenericSiteUtil::DiscoverDynamicCategories() doesn't pass 
forceUTF8Encoding when calling GetWebData().
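
The effect of the missing charset can be sketched in a few lines of Python (an illustration only, not the plugin's C# code). The helper `decode_body` and its `force_utf8` parameter are hypothetical names, and the fallback to Latin-1 when no charset is declared is an assumption made for the example. What the plugin actually falls back to is not stated in this report.

```python
from typing import Optional


def decode_body(raw: bytes, content_type: Optional[str], force_utf8: bool = False) -> str:
    """Decode an HTTP body; force_utf8 overrides whatever the header says."""
    if force_utf8:
        return raw.decode("utf-8")

    # Look for a "charset=..." parameter in the Content-Type header value.
    charset = None
    if content_type:
        for part in content_type.split(";")[1:]:
            key, _, value = part.strip().partition("=")
            if key.lower() == "charset" and value:
                charset = value.strip('"')

    # Assumption for this sketch: no declared charset means a single-byte
    # fallback, which turns multi-byte UTF-8 sequences into garbage.
    return raw.decode(charset or "latin-1")


raw = "亞洲電視".encode("utf-8")

garbled = decode_body(raw, "text/html")                # header has no charset
fixed = decode_body(raw, "text/html", force_utf8=True)  # override applied

print(repr(garbled))
print(fixed)
```

Without the override, each three-byte UTF-8 character is split into three unrelated single-byte characters, which matches the corrupted category names described above.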

I was able to fix the problem by modifying the code. In 
GenericSiteUtil::DiscoverDynamicCategories() [line 117], replacing the 
GetWebData() call with the following fixed the issue:

    string data = GetWebData(baseUrl, GetCookie(), null, null, forceUTF8Encoding);

Original issue reported on code.google.com by wongsam...@gmail.com on 9 Jul 2010 at 6:14

GoogleCodeExporter commented 9 years ago
You already added:
<item key="forceUTF8Encoding"><![CDATA[True]]></item>
to the Configuration. ;)
I'll add a field with that name to GenericSiteUtil and use that bool on 
every web request in the class, not just when getting categories. The encoding 
error should occur on all other pages retrieved from that server as well.

Original comment by bborgsd...@gmail.com on 9 Jul 2010 at 9:42

GoogleCodeExporter commented 9 years ago
Correcting myself: the field is already there, it's just not used in all cases.

Original comment by bborgsd...@gmail.com on 9 Jul 2010 at 9:49

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r781.

Original comment by bborgsd...@gmail.com on 9 Jul 2010 at 9:53