elpendor / ES-scraper

A scraper for EmulationStation

AttributeError: 'NoneType' object has no attribute 'text' #7

Closed transcendtient closed 11 years ago

transcendtient commented 11 years ago

When trying to run the scraper I get a few games found, and it downloads their box art, but I get this error on the third title it finds.

Game Found:Batman
Traceback (most recent call last):
  File "scraper.py", line 246, in <module>
    getGameData(ES_systems[i][1],ES_systems[i][2],ES_systems[i][3])
  File "scraper.py", line 174, in getGameData
    if imgNode.text is not None and args.noimg is False:
AttributeError: 'NoneType' object has no attribute 'text'
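For context: `Element.find` in Python's `xml.etree.ElementTree` returns `None` when no child matches, so calling `.text` on the result raises exactly this error. Below is a minimal sketch of a guard. The `boxart` tag, the `Game` wrapper and the `image_url` helper are assumptions for illustration, not taken from scraper.py.

```python
import xml.etree.ElementTree as ET


def image_url(game_node, want_images=True):
    """Return the box art text, or None if images are disabled or missing."""
    if not want_images:
        return None
    img_node = game_node.find("boxart")  # hypothetical tag name
    # find() returns None when the element is absent, so test the node
    # itself before touching .text.
    if img_node is None or img_node.text is None:
        return None
    return img_node.text.strip()


# Hypothetical database responses: one entry with art, one without.
with_art = ET.fromstring(
    "<Game><GameTitle>Ajax</GameTitle><boxart>front.jpg</boxart></Game>"
)
without_art = ET.fromstring("<Game><GameTitle>Batman</GameTitle></Game>")

print(image_url(with_art))
print(image_url(without_art))
```

With this ordering of checks, an entry that has no art is simply treated as "no image" instead of aborting the run.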

transcendtient commented 11 years ago

I think this happens with a bad rom. I'm running a scrape with -noimg to find the second bad rom now (I renamed batman.zip to batman.bak to skip it), but it's still pretty slow going to find the broken roms in here since you have to rerun it every time. Maybe the gamelist could be updated dynamically to give a partial working list, so you avoid rescanning everything if you added a large number of roms at once and haven't completed a first scan? Also, it doesn't search for roms in alphabetical order for some reason. If I knew how the check picks roms I'd cut my work down a lot as well.

elpendor commented 11 years ago

@transcendtient

  1. Seems like a simple check error. I'm currently rewriting/cleaning the whole scraper code (or at least most of it). I'd like you to try again after I commit the changes.
  2. Define "complete a scan". Do you mean "successfully identified a game" or "successfully identified all games for a system"? The way it works now is that it'll save the games into the gamelist after all files have been scanned.
    Opening/editing the file each time a game is found seems kinda write-intensive to me but I guess I could try to catch any exceptions and, if there's an error somewhere, save what's already been identified.
  3. I always assumed python's os.walk got the files alphabetically. I'll look into it.

elpendor commented 11 years ago

I committed some of the bug fixes and cleaner code and added alphabetical scanning. Let me know what your results are.
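On the ordering point: `os.walk` does not sort anything itself; it yields names in whatever order the operating system lists them. A minimal sketch of alphabetical scanning, assuming a hypothetical `walk_sorted` helper (not the code that was committed):

```python
import os
import tempfile


def walk_sorted(root):
    """Yield file paths under root in a stable, case-insensitive order."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting dirnames in place controls the order os.walk descends in
        # (this works because topdown=True is the default).
        dirnames.sort(key=str.lower)
        for name in sorted(filenames, key=str.lower):
            yield os.path.join(dirpath, name)


# Small demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("batman.zip", "Ajax.zip", "aliens.zip"):
        open(os.path.join(tmp, name), "w").close()
    ordered = [os.path.basename(p) for p in walk_sorted(tmp)]

print(ordered)
```

Sorting with `key=str.lower` keeps `Ajax.zip` ahead of `aliens.zip`; a plain `sorted()` would put all capitalised names first.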

I'll try to clean up the exporting code later and see what I can do about updating the gamelist in the event of an error.
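One way to save what has already been identified without writing the file after every game is a `try`/`finally` around the scan loop. This is only a sketch: `scan_all`, `fake_lookup` and the `save` callback are hypothetical stand-ins, not functions from scraper.py.

```python
def scan_all(files, lookup, save):
    """Scan files in order and always save whatever was identified."""
    found = []
    try:
        for path in files:
            found.append(lookup(path))
    finally:
        # Runs on success and on failure, so a crash partway through
        # still writes the entries collected so far (one write per run).
        save(found)
    return found


saved = []


def fake_lookup(path):
    """Hypothetical stand-in for the real scrape call."""
    if path == "batman.zip":
        raise AttributeError("'NoneType' object has no attribute 'text'")
    return path.rsplit(".", 1)[0]


try:
    scan_all(
        ["1942.zip", "ajax.zip", "batman.zip", "contra.zip"],
        fake_lookup,
        saved.extend,
    )
except AttributeError:
    pass  # the error still surfaces, but the partial list was saved first

print(saved)
```

The error is not swallowed, so the failing rom is still reported, but the next run can skip everything that was saved.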

transcendtient commented 11 years ago

1) After I find the next bad rom I will.

2) I meant complete for all games in a system. You see, I sent over a lot of roms at once...

Some other thoughts: If it wrote as it went along each time a bad file was found, it would not have to rescan the whole library, as your script already skips roms with entries (I believe, I just started using the verbose -v switch). Even better, if you wrote a flag to the xml or a separate file that a file was scanned, you could skip already scanned files, and you could rescan unidentified files from the list with a separate switch on the command line.

elpendor commented 11 years ago

  1. Does Batman still give errors? It shouldn't.
  2. Yeah, I figured that much.

> If it wrote as it went along each time a bad file was found it would not have to rescan the whole library as your script already skips roms with entries (I believe, I just started using the verbose -v switch)

I don't understand your "bad file" concept. Please elaborate on that.

Yeah, if a rom is already in the gamelist, it'll just skip it.

> Even better, if you wrote a flag to the xml or a separate file that a file was scanned, you could skip already scanned files, and you could rescan unidentified files from the list with a separate switch on the command line.

That makes sense. I'll see what I can do.
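The "separate file" idea could look roughly like the sketch below. Everything here is an assumption for illustration: the `scanned.json` file name, the helper names and the `rescan_unidentified` switch do not exist in the scraper.

```python
import json
import os
import tempfile


def load_scanned(state_path):
    """Return the set of file names recorded by earlier runs."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path) as fh:
        return set(json.load(fh))


def save_scanned(state_path, scanned):
    """Write the scanned file names out as a sorted JSON list."""
    with open(state_path, "w") as fh:
        json.dump(sorted(scanned), fh)


def pending_files(all_files, identified, scanned, rescan_unidentified=False):
    """Pick the files a run still has to look up."""
    todo = []
    for name in all_files:
        if name in identified:
            continue  # already has a gamelist entry
        if name in scanned and not rescan_unidentified:
            continue  # tried before without a match; skip unless asked
        todo.append(name)
    return todo


with tempfile.TemporaryDirectory() as tmp:
    state = os.path.join(tmp, "scanned.json")  # hypothetical file name
    save_scanned(state, {"ajax.zip", "badrom.zip"})
    scanned = load_scanned(state)

roms = ["ajax.zip", "badrom.zip", "contra.zip"]
normal_run = pending_files(roms, identified={"ajax.zip"}, scanned=scanned)
retry_run = pending_files(
    roms, identified={"ajax.zip"}, scanned=scanned, rescan_unidentified=True
)
print(normal_run, retry_run)
```

A normal run only looks at files it has never seen, while the retry switch brings back the ones that were scanned but never identified.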

transcendtient commented 11 years ago

Now I'm getting errors on titles I was finding information on. It finds information on roms in the database that start with numbers, but not letters, I believe. I've run through and it's giving me this same error for all but the first couple of roms it finds information for.

ajax.zip and aliens.zip and more I haven't looked for yet.

  File "scraper.py", line 301, in <module>
    scanFiles(ES_systems[i])
  File "scraper.py", line 221, in scanFiles
    lst_genres=getGenres(result)
  File "scraper.py", line 150, in getGenres
    for item in modes.find("Genres").iter("genre"):
AttributeError: 'NoneType' object has no attribute 'iter'
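This one is the same kind of failure one level deeper: `find("Genres")` returns `None` for an entry with no genre block, and `.iter` is then called on `None`. A sketch of one way around it, using `findall` with a path instead of chaining; the `Genres` and `genre` names come from the traceback, while the `Game` wrapper and the `get_genres` helper are assumptions:

```python
import xml.etree.ElementTree as ET


def get_genres(game_node):
    """Return genre names, or an empty list if the entry lists none."""
    # findall() with a path returns [] when "Genres" is absent, so there
    # is no intermediate None to call .iter() on.
    return [g.text.strip() for g in game_node.findall("Genres/genre") if g.text]


tagged = ET.fromstring(
    "<Game><Genres><genre>Action</genre><genre>Shooter</genre></Genres></Game>"
)
untagged = ET.fromstring("<Game><GameTitle>Aliens</GameTitle></Game>")

print(get_genres(tagged))
print(get_genres(untagged))
```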

EDIT ABOVE

elpendor commented 11 years ago

Huh, that's weird.

Filename? Arguments used?

EDIT: Committed an extra check in the scraper. Update and tell me if it still happens.

transcendtient commented 11 years ago

Yes it's finding more roms now. I renamed my files to .zip to recheck them.

using python scraper.py -v -w 300

It didn't download art for batman, only said Game Found: Batman

EDIT: The error was coming from titles it doesn't download art for

elpendor commented 11 years ago

> It didn't download art for batman, only said Game Found: Batman

If it didn't, then there's no boxart to download in that DB.

transcendtient commented 11 years ago

No I mean all the titles I originally had problems with are the titles it's not finding art for. Batman, and Twin Eagle II.

But, I think you fixed the problem man, you're a wizard.

I'm still reading the list of files and no errors yet. Note: it did kick me out when I ran it from retropiesetup.sh

transcendtient commented 11 years ago

So this is fixed for sure, it finished scanning the entire system. It didn't find but about 2-5% of my roms though, and I was wondering what I could do to increase the chances it finds my rom in the database.

elpendor commented 11 years ago

> It didn't find but about 2-5% of my roms though, and I was wondering what I could do to increase the chances it finds my rom in the database.

https://github.com/petrockblog/RetroPie-Setup/wiki/ES-scraper

The last section should be helpful. Basically the files need proper names.

Alternatively, you could try CRC scraping.
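For reference, a CRC-32 checksum of a file can be computed with Python's standard `zlib` module. This sketch only shows the checksum itself; how the scraper's CRC mode is invoked, and whether it hashes the archive or the rom inside it, is not stated in this thread. The `file_crc32` helper and the sample file are illustrative.

```python
import os
import tempfile
import zlib


def file_crc32(path, chunk_size=65536):
    """Return the CRC-32 of a file as 8 uppercase hex digits."""
    crc = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            # Passing the running value back in continues the checksum.
            crc = zlib.crc32(chunk, crc)
    return "%08X" % (crc & 0xFFFFFFFF)


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")  # throwaway demo file
    with open(sample, "wb") as fh:
        fh.write(b"123456789")
    checksum = file_crc32(sample)

print(checksum)  # CBF43926, the standard CRC-32 check value for "123456789"
```

A checksum match does not depend on the file name, which is why it can help when the files are not named the way the database expects.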