sahib / glyr

Glyr is a music-related metadata search engine with both a command-line interface and a C API
GNU Lesser General Public License v3.0

Cut away Lyricwiki clutter on end #93

Open omgold opened 5 years ago

omgold commented 5 years ago

For quite a while, lyrics fetched from Lyricwiki have had several lines of garbage appended, like "External linksNominate as Song of the Day..."

It seems that the end detection in the HTML parsing no longer works properly. For me, a change like this works better:

--- lib/intern/lyrics/lyricswiki.c.orig 2019-06-07 13:22:06.149875353 +0200
+++ lib/intern/lyrics/lyricswiki.c 2019-06-07 13:22:17.587382031 +0200
@@ -65,7 +65,7 @@
 #define LYR_NODE "<div class='lyricbox"
 #define LYR_BEGIN ">"
-#define LYR_ENDIN "<!--"
+#define LYR_ENDIN "<div class='lyricsbreak"
 #define LYR_FOOTER "<div id=\"songfooter"
 #define LYR_CREDITS "<table"
 #define LYR_INSTRUMENTAL "/Category:Instrumental"
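To illustrate why the end marker matters, here is a minimal sketch of marker-based extraction using the three marker strings from the patch. The helper `extract_lyrics` is hypothetical and is not glyr's actual parsing code; it only assumes that the lyrics sit between the opening `lyricbox` tag and the first `lyricsbreak` div, which is what the patch suggests.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Marker strings as they appear in the patched lyricswiki.c. */
#define LYR_NODE  "<div class='lyricbox"
#define LYR_BEGIN ">"
#define LYR_ENDIN "<div class='lyricsbreak"

/*
 * Hypothetical helper (an assumption for illustration, not glyr's API):
 * returns a newly allocated copy of the text between the end of the
 * opening lyricbox tag and the end marker, or NULL if any marker is
 * missing or allocation fails. The caller frees the result.
 */
static char *extract_lyrics(const char *html)
{
    assert(html != NULL);

    /* Locate the opening tag of the lyrics container. */
    const char *node = strstr(html, LYR_NODE);
    if (node == NULL)
        return NULL;

    /* Skip past the '>' that closes the opening tag. */
    const char *begin = strstr(node, LYR_BEGIN);
    if (begin == NULL)
        return NULL;
    begin += strlen(LYR_BEGIN);

    /* Everything from the end marker onwards is discarded. */
    const char *end = strstr(begin, LYR_ENDIN);
    if (end == NULL)
        return NULL;

    size_t len = (size_t)(end - begin);
    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, begin, len);
    out[len] = '\0';
    return out;
}
```

With the old `"<!--"` marker, the cut happens only at the first HTML comment after the lyrics, so anything the page places before that comment ends up in the result. Cutting at the `lyricsbreak` div stops right after the lyrics text, assuming the page still emits that div.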