elpendor / ES-scraper

A scraper for EmulationStation

Handle empty XML returned by archive.vg #28

Closed by streeto 10 years ago

streeto commented 10 years ago

When scraping by CRC, some files yield an XML response whose entries are all empty. For example, I have a GBA ROM called Castlevania.gba, which has a CRC of 611535DC. Querying archive.vg for it at http://api.archive.vg/2.0/Game.getInfoByCRC/xml/7TTRM4MNTIKR2NNAGASURHJOZJ3QXQC5/611535DC returns the following:

<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <opensearch:Query searchTerms="611535dc"/>
  <opensearch:totalResults>1</opensearch:totalResults>
  <games>
    <game>
      <id></id>
      <title></title>
      <description></description>
      <genre></genre>
      <developer></developer>
      <esrb_rating></esrb_rating>
      <box_front></box_front>
      <box_front_small></box_front_small>
      <system></system>
      <system_title></system_title>
      <size></size>
      <rom></rom>
      <romName></romName>
    </game>
(...)
  </games>
  <timestamp>1360614791</timestamp>
</OpenSearchDescription>

Here, (...) stands for many more empty game elements like the first.

This causes the game title and all other data to become "None" in the gamelist.

  <game>
    <path>/Users/andre/games/gba/Castlevania.gba</path>
    <name>None</name>
    <desc>None</desc>
    <image>/Users/andre/games/gba/Castlevania</image>
    <releasedate />
    <publisher />
    <developer>None</developer>
    <genres>
      <genre>None</genre>
    </genres>
  </game>

I am not sure what to do here. Do we code around this bug in archive.vg, or investigate further?

elpendor commented 10 years ago

Sorry it took me a while to answer. Aloshi linked me to this, but I couldn't do anything at the time, so I quickly checked the commit and merged it into the repo.

Yeah, it would be best to ignore those. I think this should be reported to archive.vg though; that's not a normal result for a query.
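
For anyone landing here later, ignoring those entries could look roughly like the sketch below. It is not the merged commit, only an illustration under a few assumptions: the response is already available as a string, an entry counts as empty when its title element has no text, and the function name first_nonempty_game is made up for this example.

```python
import xml.etree.ElementTree as ET


def first_nonempty_game(xml_text):
    """Return the first <game> element with a non-empty <title>, or None.

    Hypothetical helper: treats a missing or blank <title> as an empty
    entry, which matches the archive.vg response shown in this issue.
    """
    root = ET.fromstring(xml_text)
    for game in root.iter("game"):
        # findtext returns "" for <title></title> and None if it is missing
        title = game.findtext("title")
        if title and title.strip():
            return game
    return None
```

A caller would then skip the ROM (or fall back to a name-based search) when the function returns None, instead of writing "None" into the gamelist.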