tvgrabbers / tvgrabnlpy

This version is deprecated, see: tvgrabpyAPI
https://github.com/tvgrabbers/tvgrabpyAPI
GNU General Public License v2.0

Sub-genres names interference #71

Closed: zapp-it closed this issue 7 years ago

zapp-it commented 7 years ago

First of all, thank you for the excellent work you did with this scraper! I’ve noticed some small “issues”. Sometimes the sub-genres can have a leading space, for example: informatief: reportage = informatief: reportage =

Also, sometimes the sub-genre can have a dot at the end, for example: informatief: reportage. =

hikavdh commented 7 years ago

I guess you're talking about the tv_grab_nl_py.set file? The space there is just layout. I just checked and I do see some dots on the tvgids.tv genres. I can add code to remove the final dot there, but one of the troubles with tvgids.tv is that their genres are a little bit ad hoc.

zapp-it commented 7 years ago

I would appreciate it if you could remove the final dot. Yes, I know that the sub-genres are sometimes a bit of a mess, and not only on tvgids.tv ;-)

hikavdh commented 7 years ago

I'll add a strip('.') to the tvgids.tv subgenre retrieval in the next release (2.2.20). As I more or less said already, leading and trailing spaces are always stripped everywhere, not only on the subgenres. Also, for your information, comparisons are in almost all cases done in lowercase, so they are case-insensitive. After reading, all tables in tv_grab_nl_py.set get a .lower().strip(), and genres are capitalized again in the xmltv output, so it does not matter if you use extra spaces or capitalization. Finally, any tvgids.tv subgenre longer than 20 chars is ignored, as it is probably a description.
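
For illustration, the normalization described above could look roughly like the sketch below. This is not the grabber's actual code: the function names `normalize_subgenre` and `normalize_table_entry` and the constant `MAX_SUBGENRE_LEN` are hypothetical, and it is an assumption that the 20-character check is applied after stripping. Note that `strip('.')` removes dots at both ends of the string, not only a final dot.

```python
# Hypothetical sketch of the clean-up rules described in this thread.
# Names and the order of the steps are assumptions, not the real grabber code.

MAX_SUBGENRE_LEN = 20  # longer values are assumed to be descriptions


def normalize_table_entry(text):
    """Lowercase and strip an entry read from tv_grab_nl_py.set."""
    return text.lower().strip()


def normalize_subgenre(raw):
    """Clean a tvgids.tv subgenre; return None if it should be ignored."""
    # Strip surrounding whitespace, then dots, then any whitespace left over.
    cleaned = raw.strip().strip('.').strip().lower()
    if not cleaned or len(cleaned) > MAX_SUBGENRE_LEN:
        return None
    return cleaned


if __name__ == "__main__":
    for value in (" reportage", "Reportage.", "een lange beschrijving van het programma"):
        print(repr(value), "->", repr(normalize_subgenre(value)))
```

With rules like these, " reportage" and "Reportage." would both end up as the same lowercase key, so extra spaces, capitalization, or a final dot would not affect matching against the tables.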