Open jacobkreider opened 5 years ago
Maybe there is some inconsistent formatting in that movie's content. I just skip that movie.
Did anyone get around this error? I am having the same one @jacobkreider
Well, I think the problem is the ? character in the title. I am not sure how to solve this (I am a Python novice), but I found a way around it: I collected the paragraphs' 'href' values into a list and then resumed the iteration from the list member after this film ([823]). You can download the missing script manually.
stats = []
for p in paragraphs:
    stats.append(p.a['href'])
for p in stats[823:]:
    relative_link = p  # continue the code from here as given
The same thing happens with: What About Bob? and Who Framed Roger Rabbit?
I just used "try... except..." to skip the scripts with error. Only a very few scripts got skipped.
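For anyone wanting to try the same thing, here is a minimal sketch of the try/except approach. The names `download_all` and `fetch` are hypothetical: `fetch` stands in for `get_script` from download_all_scripts.py, and only `IndexError` is caught because that is the error shown in the traceback in this thread.

```python
def download_all(relative_links, fetch):
    """Call fetch(link) for each link, skipping links that raise IndexError.

    `fetch` is a stand-in for get_script (an assumption, not the repo's code);
    it is expected to return a (title, script) tuple.
    Returns (scripts, skipped): a dict of title -> script text, and the list
    of links that failed so they can be downloaded manually later.
    """
    scripts = {}
    skipped = []
    for link in relative_links:
        try:
            title, script = fetch(link)
        except IndexError:
            # The page had no matching script cell; remember it and move on.
            skipped.append(link)
            continue
        scripts[title] = script
    return scripts, skipped
```

Keeping the skipped links in a list makes it easy to see afterwards which scripts still need a manual download.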
Yeah, it has a problem with question marks, since a ? is encoded as %3f in the URL. If you look at my fork, I manually skipped over the 3 movies that have a question mark in them and will download those three manually.
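To illustrate the %3f point, here is a small sketch using only the standard library. The example.com URL is made up for illustration, and whether this is exactly what trips up download_all_scripts.py is an assumption; it only shows how a raw ? in a path gets treated as the start of a query string unless it is percent-encoded.

```python
from urllib.parse import quote, unquote, urlsplit

title = "What About Bob?"

# Percent-encoding turns the space into %20 and the question mark into %3F.
encoded = quote(title)
print(encoded)  # What%20About%20Bob%3F

# Decoding is case-insensitive, so %3f and %3F both mean "?".
print(unquote("%3f"))  # ?

# A raw "?" splits the URL: everything after it becomes the query string.
# (Hypothetical URL, for illustration only.)
parts = urlsplit("https://example.com/scripts/What-About-Bob?.html")
print(parts.path)   # /scripts/What-About-Bob
print(parts.query)  # .html
```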
Traceback after downloading 'O Brother Where Art Thous? Script.html':
Traceback (most recent call last):
  File "download_all_scripts.py", line 59, in <module>
    title, script = get_script(relative_link)
  File "download_all_scripts.py", line 42, in get_script
    script_text = script_soup.find_all('td', {'class': "scrtext"})[0].get_text()
IndexError: list index out of range
Not sure what's making it fail
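One reading of the traceback: `find_all` returns an empty list when nothing matches, so the `[0]` on line 42 raises IndexError whenever the fetched page has no `td` with class `scrtext`. Below is a rough sketch of a guard; `first_text_or_none` is a hypothetical helper, not part of the repo, and it takes the result of the `find_all` call as its argument.

```python
def first_text_or_none(matches):
    """Return the text of the first match, or None when there are no matches.

    `matches` is assumed to be the list returned by something like
    script_soup.find_all('td', {'class': "scrtext"}); each item only needs
    a get_text() method.
    """
    if not matches:
        # Nothing matched: the page probably is not a script page.
        return None
    return matches[0].get_text()
```

The caller can then check for `None` and skip that movie instead of crashing.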