Closed nerg4l closed 2 years ago
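Pagination on MAL can include an unescaped < character, which causes the crawler to build an incorrect HTML DOM: the page links that follow it end up nested inside the span.skip element instead of staying its siblings. This PR contains a change that handles both the correct and the incorrect DOM.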
Original DOM from MAL:
<div class="pagination ac"> <a class="link" href="https://myanimelist.net/anime/516/Keroro_Gunsou/episode?">1 - 100 </a><span class="skip"><</span> <a class="link" href="https://myanimelist.net/anime/516/Keroro_Gunsou/episode?offset=100">101 - 200</a> <a class="link" href="https://myanimelist.net/anime/516/Keroro_Gunsou/episode?offset=200">201 - 300</a> <a class="link current" href="https://myanimelist.net/anime/516/Keroro_Gunsou/episode?offset=300">301 - 358</a> </div>
Crawler DOM:
<div class="pagination ac"> <a class="link" href="https://myanimelist.net/anime/516/Keroro_Gunsou/episode?">1 - 100</a> <span class="skip"> <a class="link" href="https://myanimelist.net/anime/516/Keroro_Gunsou/episode?offset=100">101 - 200</a> <a class="link" href="https://myanimelist.net/anime/516/Keroro_Gunsou/episode?offset=200">201 - 300</a> <a class="link current" href="https://myanimelist.net/anime/516/Keroro_Gunsou/episode?offset=300">301 - 358</a> </span> </div>
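To illustrate the idea of handling both shapes, here is a minimal sketch in Python using only the standard library. It is not the code in this PR; the names `PaginationLinks`, `pagination_links`, `ORIGINAL` and `CRAWLED` are hypothetical. The assumption is that collecting every a.link descendant of div.pagination, rather than only its direct children, gives the same result whether or not the links were swallowed by span.skip. The crawler DOM is supplied literally as shown above, since Python's html.parser treats the bare < as text and would not reproduce the nesting by itself.

```python
from html.parser import HTMLParser

BASE = "https://myanimelist.net/anime/516/Keroro_Gunsou/episode?"
FIRST = f'<a class="link" href="{BASE}">1 - 100</a>'
REST = (
    f'<a class="link" href="{BASE}offset=100">101 - 200</a> '
    f'<a class="link" href="{BASE}offset=200">201 - 300</a> '
    f'<a class="link current" href="{BASE}offset=300">301 - 358</a>'
)

# DOM as served by MAL: the skip marker is a bare "<" inside its own span.
ORIGINAL = f'<div class="pagination ac"> {FIRST}<span class="skip"><</span> {REST} </div>'
# DOM as rebuilt by the crawler: the span swallows the remaining links.
CRAWLED = f'<div class="pagination ac"> {FIRST} <span class="skip"> {REST} </span> </div>'


class PaginationLinks(HTMLParser):
    """Collect every a.link below div.pagination, at any nesting depth."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0    # > 0 while inside the pagination <div>
        self.in_link = False  # True while inside a collected <a>
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "div":
            # Enter the pagination div, or track a div nested inside it.
            if self.div_depth or "pagination" in classes:
                self.div_depth += 1
        elif tag == "a" and self.div_depth and "link" in classes:
            self.in_link = True
            self.links.append(
                {"href": attrs.get("href"), "text": "", "current": "current" in classes}
            )

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth:
            self.div_depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.links[-1]["text"] += data


def pagination_links(html):
    """Return the pagination links found in html, with trimmed link text."""
    parser = PaginationLinks()
    parser.feed(html)
    parser.close()
    for link in parser.links:
        link["text"] = link["text"].strip()
    return parser.links


if __name__ == "__main__":
    for link in pagination_links(CRAWLED):
        print(link["text"], link["href"], link["current"])
```

Under these assumptions, `pagination_links(ORIGINAL)` and `pagination_links(CRAWLED)` return the same four links, because the lookup never depends on whether a link is a child of the div or of the span.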
Fixes #439