dmelladom closed 7 years ago
I think the extractor breaks not because of the URL but because an expected tag such as "tituloVideo" is missing, as seen in the exact line where it fails:
elpais.py (line 69)
title = self._html_search_regex( (r"tituloVideo\s*=\s*'([^']+)'", webpage, 'title', r'<h2 class="entry-header entry-title.*?>(.*?)</h2>'), webpage, 'title')
I'll take a deeper look.
EDIT: Yes, I can confirm it fails when trying to get the title. I'll try to fix this and send a PR.
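To illustrate the failure mode described above, here is a minimal standalone sketch that tries the two title patterns from the quoted line in order, using only the standard `re` module. It does not call youtube-dl's own helper; `find_title` and the sample HTML strings are hypothetical names and data made up for the example.

```python
import re

# The two regexes are copied from the line quoted above.
# Everything else in this sketch is hypothetical.
TITLE_PATTERNS = (
    r"tituloVideo\s*=\s*'([^']+)'",
    r'<h2 class="entry-header entry-title.*?>(.*?)</h2>',
)


def find_title(webpage, patterns=TITLE_PATTERNS):
    """Return the capture of the first pattern that matches, else None."""
    for pattern in patterns:
        match = re.search(pattern, webpage)
        if match:
            return match.group(1).strip()
    return None


# Made-up page fragments: one per pattern, plus one with neither tag.
with_js_title = "<script>var tituloVideo = 'Entrevista';</script>"
with_h2_title = '<h2 class="entry-header entry-title">La voz de Inaki</h2>'
without_title = '<h1 class="titulo">Sin etiquetas esperadas</h1>'

for page in (with_js_title, with_h2_title, without_title):
    print(find_title(page))
```

When neither tag is present, the sketch returns `None`. That corresponds to the case where the extractor has no title to work with.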
Thanks a lot! Diego
On 15 Feb 2017, at 22:40, Martín Cerdeira notifications@github.com wrote:
I think the extractor breaks not because of the URL but because an expected tag such as "tituloVideo" is missing, as seen in the exact line where it fails: title = self._html_search_regex( (r"tituloVideo\s*=\s*'([^']+)'", webpage, 'title', r'<h2 class="entry-header entry-title.*?>(.*?)</h2>'), webpage, 'title') I'll take a deeper look.
Please follow the guide below
Put an x into all the boxes [ ] relevant to your issue (like that [x])
Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2017.02.14. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.
Before submitting an issue make sure you have:
What is the purpose of your issue?
The following sections concretize particular purposed issues, you can erase any section (the contents between triple ---) not applicable to your issue
If the purpose of this issue is a bug report, site support request or you are not completely sure provide the full verbose output as follows:
Add the -v flag to the command line you run youtube-dl with, copy the whole output and insert it here. It should look similar to the one below (replace it with your log inserted between triple ```):
If the purpose of this issue is a site support request please provide all kinds of example URLs support for which should be included (replace following example URLs by yours):
Description of your issue, suggested solution and other information
It looks like path extraction doesn't work properly when the link contains subcategories such as:
.../programa_la_voz_de_inaki/...
.../seccion_libros/...
.../categoria_tecnologia/...
.../categoria_ocio_y_cultura/...
.../categoria_ciencia/...
.../categoria_estilo_de_vida/...
.../seccion_gastronomia/...
.../categoria_estilo_de_vida/...
Could it be a problem with underscores (_)?
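One quick way to probe the underscore question is to check how Python regexes treat path segments like the ones listed above. The sketch below uses a hypothetical URL pattern and made-up `example.com` URLs, not the extractor's real `_VALID_URL`. It only shows that `[^/]+` and `\w` both accept underscores, which suggests, but does not prove, that the underscore itself is not the cause.

```python
import re

# Hypothetical pattern for illustration only; NOT the extractor's real _VALID_URL.
URL_PATTERN = re.compile(r'https?://[^/]+/(?:[^/]+/)*(?P<id>[^/#?]+)\.html')

# Made-up URLs; only the category segments come from the report.
urls = [
    'http://example.com/videos/categoria_tecnologia/clip.html',
    'http://example.com/videos/seccion_libros/clip.html',
    'http://example.com/videos/plain/clip.html',
]


def matches(url):
    """Return True if the hypothetical pattern matches the URL from its start."""
    return URL_PATTERN.match(url) is not None


for url in urls:
    print(url, matches(url))

# In Python's re module, \w also matches the underscore character.
print(re.fullmatch(r'\w+', 'categoria_tecnologia') is not None)
```

If the real pattern behaves the same way on these segments, the title lookup is a more likely place to investigate than the URL matching.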