troydavisson / PHRETS

PHP client library for interacting with a RETS server to pull real estate listings, photos and other data made available from an MLS system
http://troda.com
MIT License

Parsing XML data that is STANDARD-XML, not COMPACT #99

Open brettalton opened 8 years ago

brettalton commented 8 years ago

Hi Troy,

I see this request has come up quite a bit since before 2.x was written and has come up for me and others as well in recent months, especially with CREA DDF.

I wanted to discuss this with you before I try and make some ad-hoc solution and offer that as a pull request, so I hope opening this as an issue is the correct route.

I took a look at src/Parsers/Search/OneX.php as well as src/Models/Search/Results.php and noticed it was there that you were assuming the tab delimited data to be encased in <COLUMNS> (accessed via $this->getColumnNames()) and <DATA> (accessed via $this->parseRecords()), after <COUNT> (accessed via $this->getTotalCount()) and <DELIMITER> (accessed via $this->getDelimiter()).

So you're assuming a COMPACT response such as,

<RETS ReplyCode="0" ReplyText="Operation successful">
    <COUNT Records="100"/>
    <DELIMITER value="09"/>
    <COLUMNS>...</COLUMNS>
    <DATA>...</DATA>
    <DATA>...</DATA>
    <DATA>...</DATA>
    ...
</RETS>

While STANDARD-XML returns like,

<RETS ReplyCode="0" ReplyText="Operation successful">
    <COUNT Records="100"/>
    <RETS-RESPONSE>
        <Pagination>
            <TotalRecords>100</TotalRecords>
            <Limit>100</Limit>
            <Offset>1</Offset>
            <TotalPages>1</TotalPages>
            <RecordsReturned>100</RecordsReturned>
        </Pagination>
        <PropertyDetails ID="11937342" LastUpdated="Thu, 14 Jun 2012 17:05:47 GMT">
        ...
        </PropertyDetails>
        <PropertyDetails ID="11937343" LastUpdated="Thu, 14 Jun 2012 17:05:47 GMT">
        ...
        </PropertyDetails>
        ...
    </RETS-RESPONSE>
</RETS>

The data inside <PropertyDetails> is inherently complex and nested, with values in both elements and attributes, so we can't build a parser that knows the full structure in advance. This is made harder because STANDARD-XML only returns data that exists: unlike COMPACT, which lists every field in <COLUMNS> even when the corresponding <DATA> value is empty, it omits empty fields entirely, so the set of fields can vary from record to record.
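To make the "only returns data that exists" point concrete, here is a minimal sketch using PHP's SimpleXML. The sample XML and the field names (Price, Lease) are made up for illustration and are not taken from a real CREA response. It flattens only top-level child elements; nested elements would need their own handling.

```php
<?php
// Hypothetical sample: the second record omits the <Lease> element entirely.
$xmlString = <<<XML
<RETS ReplyCode="0" ReplyText="Operation successful">
    <RETS-RESPONSE>
        <PropertyDetails ID="1"><Price>100</Price><Lease>50</Lease></PropertyDetails>
        <PropertyDetails ID="2"><Price>200</Price></PropertyDetails>
    </RETS-RESPONSE>
</RETS>
XML;

$xml = simplexml_load_string($xmlString);

// First pass: collect every top-level field name seen in any record.
$fields = [];
foreach ($xml->{'RETS-RESPONSE'}->PropertyDetails as $record) {
    foreach ($record->children() as $child) {
        $fields[$child->getName()] = true;
    }
}
$fields = array_keys($fields);

// Second pass: build uniform rows, using null where a record lacks a field.
$rows = [];
foreach ($xml->{'RETS-RESPONSE'}->PropertyDetails as $record) {
    $row = ['ID' => (string) $record['ID']];
    foreach ($fields as $field) {
        $row[$field] = isset($record->{$field}) ? (string) $record->{$field} : null;
    }
    $rows[] = $row;
}
```

After this, every row has the same keys, which is roughly what COMPACT gives you for free through <COLUMNS>.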

Furthermore, this response is only for PROPERTY/PROPERTY (referring to SearchType/Class) and for a query specifically asking for 100 specific properties based on the ID. For PROPERTY/PROPERTY where ID=* (also called a Master List in CREA DDF), it returns <Property></Property> not <PropertyDetails>. The same goes for DESTINATION/DESTINATION, OFFICE/OFFICE and AGENT/AGENT and any other endpoint this RETS server or any other RETS server may have.

So I'm wondering if we could do this...

Could we, for STANDARD-XML responses, simply return the <RETS-RESPONSE> to the programmer as a data block (an array, say), rather than parsing the records via PHRETS? For instance, when $xml->{'RETS-RESPONSE'} exists, use a different parser, one that doesn't count each record but returns $dataset = $xml->{'RETS-RESPONSE'} and lets the programmer do what they need to with the data.
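A rough sketch of that check, outside of PHRETS, might look like the following. The function name detectSearchFormat is hypothetical and is not part of the library; it only shows the branching on the presence of <RETS-RESPONSE> versus <COLUMNS>.

```php
<?php
/**
 * Hypothetical helper (not part of PHRETS): classify a parsed search response
 * by looking at which top-level elements are present.
 */
function detectSearchFormat(SimpleXMLElement $xml): string
{
    if (isset($xml->{'RETS-RESPONSE'})) {
        return 'STANDARD-XML';
    }
    if (isset($xml->COLUMNS)) {
        return 'COMPACT';
    }
    return 'UNKNOWN';
}

// Trimmed-down sample payloads, for illustration only.
$compact = simplexml_load_string(
    '<RETS ReplyCode="0"><DELIMITER value="09"/><COLUMNS>A</COLUMNS><DATA>1</DATA></RETS>'
);
$standard = simplexml_load_string(
    '<RETS ReplyCode="0"><RETS-RESPONSE><PropertyDetails ID="1"/></RETS-RESPONSE></RETS>'
);

// For STANDARD-XML, hand the whole block back instead of parsing records.
$dataset = detectSearchFormat($standard) === 'STANDARD-XML'
    ? $standard->{'RETS-RESPONSE'}
    : null;
```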

I know that seems half baked, but at least data could be returned to the programmer, rather than failing with,

[2016-02-03 01:56:41] PHRETS.DEBUG: 51 total results found [] []
[2016-02-03 01:56:41] PHRETS.DEBUG: 0 results given [] []

because the parser is failing and can't read anything that isn't in <COLUMNS> & <DATA>.

We could return the data as an array without parsing. I think that's a great workaround for now...

troydavisson commented 8 years ago

In general, this idea seems possible. 2.x was written with what I've called Strategies so that different parsers could be swapped in as needed (even custom ones you might write if you need to tweak something for a non-compliant RETS server). In the case of CREA, supporting only STANDARD-XML is obviously an issue with the way PHRETS currently expects responses to come back.

I'll try out an idea for this and update this issue so you can give it a try. Like you said, the nested XML format makes it tough for a one-size-fits-all object type, but at least returning something you can use manually shouldn't be too much trouble.

edenhollander commented 8 years ago

I would also be interested in this functionality. I currently hacked the version 1.x implementation of PHRETS to return $xml->{'RETS-RESPONSE'} so I can parse the SimpleXMLElement object manually and would use version 2.x if this feature is available. Like brettalton, I'm using data from CREA.

thewebexpert commented 8 years ago

+1 for me too, would love to be able to use this functionality as well

briancaicco commented 7 years ago

+1 for me as well!

ghost commented 7 years ago

Since the concept was raised a year ago, what is the status on developing a solution based on the suggestion? Regards, Richard Tomkins

brettalton commented 7 years ago

@troydavisson, can you document how a Strategy might be written for CREA, as a non-compliant RETS server? If it's started in a new branch, maybe a couple of us can work on filling out the rest. CREA, as much as it is non-compliant, is used heavily in the Canadian real estate market, so it would be highly beneficial to have this Strategy. Even if the response was just auto-parsed and returned as a PHP object, it would be more useful than "0 results given".

simmonspaul commented 6 years ago

I was treading water with this for some time. I was using v2+ for the first time and got no results following the setup example. I found that the [effectiveUrl:GuzzleHttp\Message\Response:private] was correct even though no results were provided, and then came across this repository. For now, in OneX.php, I capture the $response->xml() in the parse function to get at the raw data. Where would the decision of strategy occur? I could see interrogating an $optional_parameter within the search function as the hook for determining the relevant grab variable (strategy) to apply. How are the strategies intended to be applied? Thanks

simmonspaul commented 6 years ago

I've tried to improve my hack to embrace the idea of strategies.

Below is a standard search: $rets->Search($resource, $class, $query, [ ...$optional_parameters... ], $recursive)

The $recursive parameter is a boolean and if it is not supplied is defaulted as false in the public function "Search" in Session.php.

Within "Search" there is an IF/ELSE statement that "grab"s the parser that should be used (recursive or not eg. $parser = $this->grab('parser.search.recursive'); ) to return output.

These parsers are defined in StandardStrategy.php.


My improved hack is to make the $recursive parameter accept non-boolean values and to use this to pass the parser strategy to be used when necessary.

To achieve this I amend the 'search' function slightly:

Declare the function with a default recursive of 'false' (to try to be sympathetic to previous use): public function Search($resource_id, $class_id, $dmql_query, $optional_parameters = [], $recursive = false) ----> public function Search($resource_id, $class_id, $dmql_query, $optional_parameters = [], $recursive = 'false')

Then modify the IF/ELSE statement (providing some backwards compatibility)

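    // Accept real booleans as well as the string/integer forms, so existing callers keep working
    if ($recursive === true || $recursive === 'true' || $recursive === 1) {
        $parser = $this->grab('parser.search.recursive');
    } elseif ($recursive === false || $recursive === 'false' || $recursive === 0) {
        $parser = $this->grab('parser.search');
    } else {
        // Anything else is treated as the key of a custom parser
        $parser = $this->grab($recursive);
    }
    return $parser->parse($this, $response, $parameters);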

Therefore if developers want to setup custom parsers, they can define them in StandardStrategy.php and pass them using the recursive variable.
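The dispatch rule described above can be tried in isolation with a small standalone sketch. The function resolveParserKey and the key 'parser.search.crea' are hypothetical names used only for illustration; 'parser.search' and 'parser.search.recursive' are the keys mentioned in this thread.

```php
<?php
/**
 * Hypothetical helper: map the overloaded $recursive argument to a parser key.
 * Booleans and their string/integer forms select the stock parsers;
 * any other value is passed through as a custom parser key.
 */
function resolveParserKey($recursive): string
{
    if ($recursive === true || $recursive === 'true' || $recursive === 1) {
        return 'parser.search.recursive';
    }
    if ($recursive === false || $recursive === 'false' || $recursive === 0) {
        return 'parser.search';
    }
    return (string) $recursive;
}

$default = resolveParserKey(false);                // stock, non-recursive parser
$recursive = resolveParserKey(true);               // stock, recursive parser
$custom = resolveParserKey('parser.search.crea');  // assumed custom key
```

Overloading one argument this way keeps the Search signature unchanged, at the cost of mixing two concerns (recursion and parser choice) in a single parameter.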

Hope this makes sense.

Paul-DS commented 6 years ago

I implemented a quick solution for PHRETS v2 using a custom strategy/parser.

My code:

CreaResults.php

use PHRETS\Models\Search\Results;

class CreaResults extends Results
{
    public function addCreaRecord($record) {
        $this->results->push($record);
    }
}

CreaParser.php

use PHRETS\Configuration;
use PHRETS\Http\Response;
use PHRETS\Session;
use PHRETS\Strategies\Strategy;

class CreaParser
{
    public function parse(Session $rets, Response $response, $parameters)
    {
        $parser = $rets->getConfiguration()->getStrategy()->provide(Strategy::PARSER_XML);
        $xml = $parser->parse($response);

        $data = $xml->{'RETS-RESPONSE'};

        $rs = new CreaResults;

        if (isset($data->Pagination) && isset($data->Pagination->TotalRecords)) {
            $rs->setTotalResultsCount(intval($data->Pagination->TotalRecords));

            if (isset($data->Pagination->Offset) && isset($data->Pagination->Limit) && $data->Pagination->Offset + $data->Pagination->Limit >= $data->Pagination->TotalRecords) {
                $rs->setMaxRowsReached();
                $rets->debug('Maximum rows returned in response');
            }
        }

        if (isset($data->PropertyDetails)) {
            foreach ($data->PropertyDetails as $property) {
                $rs->addCreaRecord($property);
            }
        }

        unset($xml);

        return $rs;
    }
}

CreaStrategy.php

use Illuminate\Container\Container;
use PHRETS\Configuration;
use PHRETS\Strategies\Strategy;

class CreaStrategy implements Strategy
{
    /**
     * Default components
     *
     * @var array
     */
    protected $default_components = [
        Strategy::PARSER_LOGIN => \PHRETS\Parsers\Login\OneFive::class,
        Strategy::PARSER_OBJECT_SINGLE => \PHRETS\Parsers\GetObject\Single::class,
        Strategy::PARSER_OBJECT_MULTIPLE => \PHRETS\Parsers\GetObject\Multiple::class,
        Strategy::PARSER_SEARCH => CreaParser::class,
        Strategy::PARSER_SEARCH_RECURSIVE => \PHRETS\Parsers\Search\RecursiveOneX::class,
        Strategy::PARSER_METADATA_SYSTEM => \PHRETS\Parsers\GetMetadata\System::class,
        Strategy::PARSER_METADATA_RESOURCE => \PHRETS\Parsers\GetMetadata\Resource::class,
        Strategy::PARSER_METADATA_CLASS => \PHRETS\Parsers\GetMetadata\ResourceClass::class,
        Strategy::PARSER_METADATA_TABLE => \PHRETS\Parsers\GetMetadata\Table::class,
        Strategy::PARSER_METADATA_OBJECT => \PHRETS\Parsers\GetMetadata\BaseObject::class,
        Strategy::PARSER_METADATA_LOOKUPTYPE => \PHRETS\Parsers\GetMetadata\LookupType::class,
        Strategy::PARSER_XML => \PHRETS\Parsers\XML::class,
    ];

    /**
     * @var \Illuminate\Container\Container
     */
    protected $container;

    /**
     * @param $component
     * @return mixed
     */
    public function provide($component)
    {
        return $this->container->make($component);
    }

    /**
     * @param Configuration $configuration
     * @return void
     */
    public function initialize(Configuration $configuration)
    {
        // start up the service locator
        $this->container = new Container;

        foreach ($this->default_components as $k => $v) {
            if ($k == 'parser.login' and $configuration->getRetsVersion()->isAtLeast1_8()) {
                $v = \PHRETS\Parsers\Login\OneEight::class;
            }

            $this->container->singleton($k, function () use ($v) { return new $v; });
        }
    }

    /**
     * @return Container
     */
    public function getContainer()
    {
        return $this->container;
    }
}

Then when creating your RETS configuration, just use the custom strategy

$config = new \PHRETS\Configuration;
$config->setLoginUrl(...)
        ->setUsername(...)
        ->setPassword(...);

$config->setStrategy(new CreaStrategy());

$rets = new \PHRETS\Session($config);
$rets->Login();

$results = $rets->Search(...);

foreach ($results as $result) {
    // $result is a SimpleXMLElement representing the "PropertyDetails" object
}

simmonspaul commented 6 years ago

Nice job. Thank you. Much cleaner. I'll do a deep dive when I'm next looking at that code. Thanks.

steveheinsch commented 6 years ago

@troydavisson It would be kind of cool if there was a separate phrets-strategies repo where we, the community, could add and pull in custom packages (parsers) that we need via composer. Then just set which we need to use in the config (which we can already do). That way we don't all individually have to come up with our own solutions for RETS servers that aren't using the standards and keep it separate from the main phrets repo, which should be just for compliant servers.

simmonspaul commented 5 years ago

@steveheinsch @troydavisson Agreed with both of you. Having a design on how this should be implemented would be great. Would prefer to have a way of adding these strategies/parsers without messing with the code base. Having looked at this a couple of times, I think both the strategy and parser should be independent parameters passed directly from the search. ie. Drop/install the strategies/parsers that you want to use into the corresponding folders and when specified from the search it should work.

Paul-DS, thanks again for your code. It showed me how to specify a strategy and have it work end to end.

We would therefore need to add additional "strategy/parser sets" for office and agent, and for the master and detail records. Hence, I think the core strategy could stay the same if we could specify the parsers independently.
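One way to avoid a separate parser per record type might be to collect every child of <RETS-RESPONSE> except <Pagination>, whatever its element name. This is only a sketch with SimpleXML; collectRecords is a hypothetical function, and the sample payload is invented rather than copied from a CREA response.

```php
<?php
/**
 * Hypothetical helper: gather all record elements under <RETS-RESPONSE>,
 * regardless of their tag name, skipping the <Pagination> block.
 */
function collectRecords(SimpleXMLElement $response): array
{
    $records = [];
    foreach ($response->children() as $child) {
        if ($child->getName() === 'Pagination') {
            continue;
        }
        $records[] = $child;
    }
    return $records;
}

// Invented master-list style payload using <Property> instead of <PropertyDetails>.
$xml = simplexml_load_string(
    '<RETS ReplyCode="0"><RETS-RESPONSE>'
    . '<Pagination><TotalRecords>2</TotalRecords></Pagination>'
    . '<Property ID="1"/><Property ID="2"/>'
    . '</RETS-RESPONSE></RETS>'
);

$records = collectRecords($xml->{'RETS-RESPONSE'});
```

Whether this holds for every resource depends on the server's actual responses, so it would need checking against real office and agent payloads.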

Thanks again all.

aconital commented 3 years ago

We are also trying to use PHRETS for CREA DDF and I'm curious if there is an update on this?

jeffkee commented 1 year ago

Paul-DS - thank you, life saver. I managed to patch on your code to create another parse strategy.

Note the returned object is a SimpleXML object. My existing code wanted a multi-dimensional array similar to CREA's modified PHRETS V1.02 code.

What I did is use json_encode() then json_decode(). Worked like a charm!

$results = $rets->Search(...); // the usual "Property" search, with STANDARD-XML
$listings_xml = $results->getIterator();
$listings = json_decode(json_encode($listings_xml), true); // usable array
$numlistings = count($listings);
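For anyone curious what the round trip produces, here is a small standalone example on a single record. The element names (Price, Address, City) are invented for illustration. Note that XML attributes end up under an "@attributes" key and that all values come back as strings.

```php
<?php
// Invented sample record; real CREA records have many more fields.
$record = simplexml_load_string(
    '<PropertyDetails ID="11937342">'
    . '<Price>100</Price>'
    . '<Address><City>Ottawa</City></Address>'
    . '</PropertyDetails>'
);

// Encode the SimpleXMLElement to JSON, then decode into an associative array.
$asArray = json_decode(json_encode($record), true);

$id = $asArray['@attributes']['ID'];   // attribute of the record element
$price = $asArray['Price'];            // leaf element, returned as a string
$city = $asArray['Address']['City'];   // nested element becomes a nested array
```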