oe-alliance / XMLTV-Import

Imports EPG data from Rytec XML data sources.

Malformed(?) XML ignores perfectly good events #34

Closed oottppxx closed 5 years ago

oottppxx commented 5 years ago

I've seen some EPG XML files whose program entries contain self-closing <desc ... /> and <sub-title ... /> elements.

Apparently, if the description and sub-title are empty (and maybe other fields too; those are the only two I've seen so far, and I even tested transforming them into <desc ...>, etc...), the program enumeration in the try/except block of the enumFile function raises an exception and logs a parsing error. The rest of the program entry is perfectly valid, though, so discarding the whole event is wasteful.

I've tested the below, and it seems to work pretty well. Maybe the same can be applied to other fields of the program, with defaults being whatever makes sense?

--- xmltvconverter.py
+++ xmltvconverter-mod.py
@@ -82,8 +82,14 @@
                start = get_time_utc(elem.get('start'), self.dateParser)
                stop = get_time_utc(elem.get('stop'), self.dateParser)
                title = get_xml_string(elem, 'title')
-               subtitle = get_xml_string(elem, 'sub-title')
-               description = get_xml_string(elem, 'desc')
+               try:
+                   subtitle = get_xml_string(elem, 'sub-title')
+               except:
+                   subtitle = ''
+               try:
+                   description = get_xml_string(elem, 'desc')
+               except:
+                   description = ''
                category = get_xml_string(elem, 'category')
                cat_nr = self.get_category(category,  stop-start)
                # data_tuple = (data.start, data.duration, data.title, data.short_description, data.long_description, data.type)
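
As a rough illustration of applying the same idea to every field at once, here is a minimal standalone sketch. It does not use the plugin's get_xml_string; get_text_or_default is a hypothetical helper, and the sample programme entry is made up. It assumes that an empty or missing element should simply fall back to a default value instead of raising.

```python
import xml.etree.ElementTree as ET


def get_text_or_default(elem, name, default=''):
    """Hypothetical helper: text of the first child called `name`, or `default`.

    Covers both a missing child and an empty one such as <desc lang="en" />,
    whose .text attribute is None.
    """
    child = elem.find(name)
    if child is None or child.text is None:
        return default
    return child.text.strip()


# Made-up programme entry with empty sub-title/desc and no category at all.
SAMPLE = """<programme start="20190101060000 +0000" stop="20190101070000 +0000" channel="demo">
  <title lang="en">Morning News</title>
  <sub-title lang="en" />
  <desc lang="en" />
</programme>"""

programme = ET.fromstring(SAMPLE)
title = get_text_or_default(programme, 'title')
subtitle = get_text_or_default(programme, 'sub-title')
description = get_text_or_default(programme, 'desc')
category = get_text_or_default(programme, 'category')
```

With a helper like this, an empty field no longer aborts the whole entry, and no bare except is needed, so unrelated errors would still surface.
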
arn354 commented 5 years ago

https://github.com/oe-alliance/XMLTV-Import/commit/fa46934783aa4d4fb10be3ac744ae87935890977