Hamuko / cum

comic updater, mangafied
Apache License 2.0
171 stars, 15 forks

Fix mangadex scraper (paginated chapters) #57

Closed mxnemu closed 6 years ago

mxnemu commented 6 years ago

Mangadex changed its chapter lists to be paginated. This is a quick fix so that the scraper at least works for page 1 without crashing.

For some reason the page number is wrapped in an <a> tag that has no href attribute, and that is what causes the crash.
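To illustrate the failure mode, here is a minimal sketch of skipping anchors that carry no href. It is not the code from this PR: it uses the standard library's html.parser instead of whatever parser the scraper uses, and the class name, function name, and sample markup are all made up for the example.

```python
from html.parser import HTMLParser


class ChapterLinkCollector(HTMLParser):
    """Collect href values from <a> tags, skipping anchors that have none."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) pairs; a missing href yields None.
        href = dict(attrs).get("href")
        if href:
            self.links.append(href)


def collect_links(html):
    """Return every non-empty href found in the given HTML snippet."""
    parser = ChapterLinkCollector()
    parser.feed(html)
    return parser.links


# Hypothetical markup: two chapter links around a pagination anchor without href.
sample = (
    '<a href="/chapter/1235">Ch. 1</a>'
    '<li class="active"><a>1</a></li>'
    '<a href="/chapter/1236">Ch. 2</a>'
)
print(collect_links(sample))
```

The point is only that the href lookup has to tolerate a missing attribute instead of assuming every anchor has one.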

mxnemu commented 6 years ago

I made the lines shorter, but I'm not sure whether this is the accepted style for if expressions in Python.

test_chapter_information_tomochan

I have no clue why this fails on Travis. It passes locally for me, and this merge request should have addressed the issue with this test. Any ideas?

I found another bug where '/chapter/1235/comments' would be matched as a chapter URL. I fixed it by appending (?:/[^a-zA-Z0-9]|/?$) to the MangadexChapter regex.
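A rough sketch of what the appended group does. The base pattern /chapter/(\d+) is an assumption for illustration; the real MangadexChapter regex is not shown in this thread and may differ.

```python
import re

# Hypothetical base pattern; the actual MangadexChapter regex may differ.
BASE = r"/chapter/(\d+)"
# Suffix quoted in the comment above: after the chapter id, allow either
# "/" followed by a non-alphanumeric character, or an optional "/" at the end.
SUFFIX = r"(?:/[^a-zA-Z0-9]|/?$)"

before_fix = re.compile(BASE)
after_fix = re.compile(BASE + SUFFIX)


def is_chapter_url(pattern, url):
    """Return True if the pattern finds a chapter id anywhere in the url."""
    return pattern.search(url) is not None


for url in ("/chapter/1235", "/chapter/1235/", "/chapter/1235/comments"):
    print(url, is_chapter_url(before_fix, url), is_chapter_url(after_fix, url))
```

With this assumed base pattern, the comments URL matches before the suffix is added and no longer matches afterwards, while the plain chapter URLs keep matching.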

All tests pass locally for me:

stay-gold 12
  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]  34/34  100%             
stay-gold Extra Stuff
  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]  11/11  100%             
saiki-kusuo-no-psi-nan 265
  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]  15/15  100%             
saiki-kusuo-no-psi-nan 279
  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]  25/25  100%             
saiki-kusuo-no-psi-nan 278
  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]  15/15  100%             
...hidamari-sketch 001-013
  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]  150/150  100%             
.ramen-daisuki-koizumi-san 18
  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]  8/8  100%             
.tomo-chan-wa-onna-no-ko 1
  [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]  1/1  100%
....
----------------------------------------------------------------------
Ran 9 tests in 111.488s

Hamuko commented 6 years ago

Finally got this done. Thanks for the PR.