StegSchreck / RatS

Movie Ratings Synchronization with Python
GNU Affero General Public License v3.0

AllocineParser is starting over with fewer movies #139

Open StegSchreck opened 4 years ago

StegSchreck commented 4 years ago

Describe the bug: The Allocine parser starts parsing a certain number of movies across a certain number of pages. Somewhere in the middle, it starts over with a single page containing just 36 movies.

Expected behavior: The Allocine parser processes the complete set of available (rated) movies that it initially found on Allocine.

Geckodriver log

console.error: SearchCache: "_readCacheFile: Error reading cache file:" (new Error("", "(unknown module)"))
1602949509356   Marionette  WARN    TLS certificate errors will be ignored for this session
JavaScript warning: https://www.allocine.fr/, line 138: unreachable code after return statement
JavaScript warning: https://www.allocine.fr/film/fichefilm_gen_cfilm=198937.html, line 138: unreachable code after return statement
JavaScript warning: https://www.allocine.fr/film/fichefilm_gen_cfilm=232128.html, line 138: unreachable code after return statement
StegSchreck commented 4 years ago

For some reason, AllocineParser restarts the parsing. This might be caused by an AttributeError, which is then handled in base_ratings_parser.py (https://github.com/StegSchreck/RatS/blob/master/RatS/base/base_ratings_parser.py#L35). In my test runs, it always happened at the same movie: https://www.allocine.fr/film/fichefilm_gen_cfilm=20010.html
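
If the AttributeError really is the trigger, one possible direction is to catch it per movie rather than in a handler that wraps the whole run, so a single broken detail page cannot affect the rest of the list. The following is only a minimal sketch of that idea, not the RatS implementation: `parse_title`, `parse_all_skipping_failures`, and the dict-based "pages" are hypothetical names made up for illustration, and a missing page element is simulated with `None`.

```python
# Hypothetical sketch; none of these names come from the RatS code base.
from typing import Dict, List, Optional, Tuple

Page = Dict[str, Optional[str]]


def parse_title(page: Page) -> str:
    """Simulate a detail-page parser that assumes every field is present."""
    # A missing element is modelled as None, so .strip() raises
    # AttributeError, similar to calling a method on a failed lookup.
    return page["title"].strip()


def parse_all_skipping_failures(pages: List[Page]) -> Tuple[List[str], List[Optional[str]]]:
    """Parse every page; record failing URLs instead of aborting the run."""
    parsed: List[str] = []
    failed: List[Optional[str]] = []
    for page in pages:
        try:
            parsed.append(parse_title(page))
        except AttributeError:
            # Isolate the failure to this one movie and keep going.
            failed.append(page["url"])
    return parsed, failed


if __name__ == "__main__":
    # Made-up sample data; the middle entry simulates the problematic movie.
    sample = [
        {"url": "movie-1", "title": " First "},
        {"url": "movie-2", "title": None},
        {"url": "movie-3", "title": "Third"},
    ]
    titles, failures = parse_all_skipping_failures(sample)
    print(titles, failures)
```

Collecting the failed URLs in a separate list would also make it easy to report which movies could not be parsed, such as the one linked above.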