yanwong / ganon

Automatically exported from code.google.com/p/ganon

Suggestion to avoid: Fatal error: Call to a member function getPlainText() on a non-object #22

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Hi (again)

This is just a suggestion for improvement.

I am writing several scrapers to get data from a number of websites, and I'm concerned about 
the possibility of changes in the structure of the sites that I'm scraping.

My scrapers do systematic (and unattended) work, so I always need to check 
that all the tags I expect are in the web page, and log the result for later analysis.

With this in mind, I can never chain several operations (select, 
getPlainText, etc.) because if any of 
the selects returns null, the script crashes with the error:

Fatal error: Call to a member function getPlainText() on a non-object in ...

Sometimes I call select just to test whether a node is present (for example, 
to test whether the div with id 
"LastMinuteOffer" is present).
In this case, I don't chain calls, I just do:

$t1=$html->select('div#LastMinuteOffer',0);
if ($t1){
//There is a last-minute offer...
}

But sometimes I just want to get the text of a specific node, so in some 
cases I chain several
calls into one, something like this:

$MovieTitle=$html->select('h3.title a.title',0)->getPlainText();

In this case, if the select fails it returns null, so the getPlainText() call triggers 
the error:

Fatal error: Call to a member function getPlainText() on a non-object in ...

and the script fails.

This circumstance forces me to chain nothing and test everything, 
with nasty code like this:

$t1=$html->select('h3.title a.title',0);
if (!$t1) {$TheError='Fail in Movie Title'; return false; }
$MovieTitle=$t1->getPlainText();

I have written a new function to improve my code; perhaps someone else is 
interested in it:

select_imperative

With this function, I can chain as many calls as I want without risk of fatal errors, and I 
can catch the exception if any of the
selects fails.
I can do something like:

  try {
    $MovieTitle=$html->select_imperative('h3.title a.title',0)->getPlainText();
  } catch(Exception $e) {
    $TheError='Fail in Movie Title: '.$e->getMessage()."\n";
    return false; //Return with error
  }
  return true;   //Return All ok

Or I can group all the error handling into a single catch:

  try {
    $MovieTitle=$html->select_imperative('h3.title a.title',0)->getPlainText();
    $Author=$html->select_imperative('span.author',0)->getPlainText();
    $Date=$html->select_imperative('span.date',0)->getPlainText();
    $Format=$html->select_imperative('span.format',0)->getPlainText();

  } catch(Exception $e) {
    $TheError='Error scraping Movie: '.$e->getMessage();
    return false; //Return with error
  }
  return true;   //Return All ok

With this, my code gets shorter... a lot shorter.

In the class HTML_Node:

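  // Same arguments as select(), but throws instead of returning null when nothing matches.
  // The loose == comparison also treats an empty array as a failure.
  function select_imperative($query = '*', $index = false, $recursive = true, $check_self = false) {
    $rv = $this->select($query, $index, $recursive, $check_self);
    if ($rv == null) {
      throw new Exception('Null query in select: '.$query);
    }
    return $rv;
  }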

and, in the class HTML_Parser:

  function select_imperative($query = '*', $index = false, $recursive = true, $check_self = false) {
    return $this->root->select_imperative($query, $index, $recursive, $check_self);
  }
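
To try the idea without the library installed, here is a minimal self-contained sketch. StubNode and scrapeMovie are hypothetical stand-ins invented for this example (they are not part of ganon); StubNode only mimics the behaviour described above, where select returns null when nothing matches, and its select takes fewer arguments than the real one.

```php
<?php
// Hypothetical stand-in for a parsed node; NOT the real HTML_Node class.
class StubNode {
    private $text;
    private $children; // map: selector string => StubNode

    public function __construct($text = '', array $children = array()) {
        $this->text = $text;
        $this->children = $children;
    }

    // Mimics the behaviour described in this issue: null when nothing matches.
    // $index is accepted only to keep the call shape; the stub ignores it.
    public function select($query = '*', $index = false) {
        return isset($this->children[$query]) ? $this->children[$query] : null;
    }

    // Same wrapper idea as above: throw instead of returning null.
    public function select_imperative($query = '*', $index = false) {
        $rv = $this->select($query, $index);
        if ($rv === null) {
            throw new Exception('Null query in select: ' . $query);
        }
        return $rv;
    }

    public function getPlainText() {
        return $this->text;
    }
}

// Hypothetical scraper: one try/catch covers every chained call.
function scrapeMovie(StubNode $html, &$error) {
    try {
        $title  = $html->select_imperative('h3.title a.title', 0)->getPlainText();
        $author = $html->select_imperative('span.author', 0)->getPlainText();
    } catch (Exception $e) {
        $error = 'Error scraping Movie: ' . $e->getMessage();
        return false;
    }
    return array('title' => $title, 'author' => $author);
}

// The page has a title node but no author node, so the second select fails.
$page = new StubNode('', array('h3.title a.title' => new StubNode('Alien')));
$error = null;
$result = scrapeMovie($page, $error);
// $result is false and $error names the missing selector 'span.author'.
```

Because the exception message carries the selector, the log shows which tag disappeared from the page, which is the information an unattended scraper needs.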

Regards!

Original issue reported on code.google.com by Radika...@gmail.com on 21 Sep 2012 at 6:26