TLA-FLAT / FLAT

Using Saxon C PHP API #2

Open menzowindhouwer opened 8 years ago

menzowindhouwer commented 8 years ago

We ran a trial to switch the CMDI rendering to XSLT 2.0 using the Saxon C PHP API. The trial failed due to segmentation faults in the PHP interpreter. This issue records the changes we made, in case we want to repeat the trial with a future version of the API.

Dockerfile:

RUN apt-get update &&\
    apt-get -y dist-upgrade &&\
    apt-get -y install ... make gcc php5-dev gcj-jdk lib32z1 lib32ncurses5 lib32bz2-1.0
ENV JAVA_HOME /usr/lib/jvm/java-gcj

#
# Install Saxon C + PHP API
#
RUN mkdir -p /tmp/saxon &&\
    cd /tmp/saxon &&\
    wget "http://www.saxonica.com/saxon-c/libsaxon-HEC-setup64-v1.0.0.zip" &&\
    unzip libsaxon-HEC-setup64-v1.0.0.zip &&\
    ./libsaxon-HEC-setup64-v1.0.0 -dest /tmp/saxon/Saxon-HEC1.0.0 &&\
    cd ./Saxon-HEC1.0.0 &&\
    cp ./libsaxonhec.so /usr/lib/libsaxonhec.so &&\
    cp ./libsaxonhec.so /usr/lib/libsaxon.so &&\
    cp -r ./rt /usr/lib/rt &&\
    mkdir -p /usr/lib/rt/lib/jetjvm &&\
    ln -s $JAVA_HOME/jre/lib/amd64/server/libjvm.so /usr/lib/rt/lib/jetjvm/libjvm.so
ENV LD_LIBRARY_PATH=/usr/lib/rt/lib/amd64:/usr/lib/rt/lib/amd64/jetvm:$LD_LIBRARY_PATH
RUN cd /tmp/saxon/Saxon-HEC1.0.0 &&\
    cp -r saxon-data /usr/lib/saxon-data &&\
    echo "# JetVM env path (required for Saxon)" > /etc/ld.so.conf.d/jetvm.conf &&\
    echo "/usr/lib/rt/lib/amd64" >> /etc/ld.so.conf.d/jetvm.conf &&\
    echo "/usr/lib/rt/lib/amd64/jetvm" >> /etc/ld.so.conf.d/jetvm.conf &&\
    ldconfig &&\
    echo "export LD_LIBRARY_PATH=/usr/lib/rt/lib/amd64:/usr/lib/rt/lib/amd64/jetvm:$LD_LIBRARY_PATH" >> /etc/apache2/envvars &&\
    export CPLUS_INCLUDE_PATH=/usr/lib/rt/amd64:$JAVA_HOME/include/linux:$JAVA_HOME/include &&\
    cd ./Saxon.C.API &&\
    phpize &&\
    ./configure --enable-saxon &&\
    make &&\
    make install &&\
    sed -i 's|enable_dl = Off|enable_dl = On|g' /etc/php5/apache2/php.ini

The PHP extension can be enabled by adding extension=saxon.so to /etc/php5/apache2/php.ini (and /etc/php5/cli/php.ini), or by creating /etc/php5/mods-available/saxon.ini with

extension=saxon.so

The latter can then be enabled using php5enmod saxon. Running php -m will show whether enabling the module worked (for the CLI at least).
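The mods-available route could be scripted roughly as follows. This is a minimal sketch, not the setup used in the trial: the function names write_saxon_ini and module_listed are made up for illustration, and the directory argument exists only so the helper can be tried outside /etc.

```shell
# Write the one-line ini file that loads the Saxon extension.
# The directory defaults to the mods-available path mentioned above;
# passing another directory is a hypothetical convenience for dry runs.
write_saxon_ini() {
    mods_dir="${1:-/etc/php5/mods-available}"
    printf 'extension=saxon.so\n' > "$mods_dir/saxon.ini"
}

# Read a module list (one name per line, as printed by `php -m`) on stdin
# and succeed only if the given module name appears as a whole line.
module_listed() {
    grep -qix -- "$1"
}
```

On a real system this would be used as write_saxon_ini && php5enmod saxon, followed by php -m | module_listed saxon. The pipeline's exit status is that of grep, so the check only says whether the module was listed, not whether php exited cleanly.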

php -m also shows the current problem: it ends with a segmentation fault. It's unclear why: neither a gdb run nor LD_DEBUG=all revealed a clear cause. An actual PHP run showed that the module loads and Saxon executes fine. On the CLI, however, the segfault happens when we don't use Saxon functions, but not when we do use them. In Apache PHP we always get a segfault (child pid 3245 exit signal Segmentation fault (11)). We still had some hope that we could use dl("saxon") to load the module only when we needed it, but it turns out dl() is unavailable in Apache PHP.
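When repeating the trial with a newer API version, the CLI crash can be checked for mechanically. The sketch below assumes a shell such as bash or dash, which report a process killed by signal N as exit status 128+N (SIGSEGV is 11, hence 139); the function name died_of_segfault is hypothetical.

```shell
# Run a command, discard its output, and succeed only if the command
# was terminated by SIGSEGV (exit status 128 + 11 = 139 in bash/dash).
died_of_segfault() {
    status=0
    "$@" >/dev/null 2>&1 || status=$?
    [ "$status" -eq 139 ]
}
```

For example, if died_of_segfault php -m; then echo "php -m still segfaults"; fi. This only covers the CLI; for Apache PHP the child exit signal in the error log remains the thing to watch.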

The following changes were made to the CMDI solution pack to use Saxon:

islandora_solution_pack_cmdi/theme/theme.inc:

    $xslt_dom = new DOMDocument();
    // Keep the result of load() so the check below has something to test.
    $did_load_ok = $xslt_dom->load($file);
    if ($did_load_ok) {
        $saxon = extension_loaded("saxon");
        if (!$saxon)
            $saxon = dl("saxon");
        if ($saxon) {
          $saxonProc = new Saxon\SaxonProcessor();
          $xsltProc = $saxonProc->newXsltProcessor();
          $xsltProc->setSourceFromXdmValue($saxonProc->parseXmlFromString($cmd));
          $xsltProc->compileFromFile($file);
          $variables['metadata'] = $xsltProc->transformToString();
          error_log("?DBG: used Saxon");
        } else {
          $xslt = new XSLTProcessor();
          $xslt->importStylesheet($xslt_dom);
          $variables['metadata'] = $xslt->transformToXml($input);
          error_log("?DBG: didn't use Saxon");
        }
    }

And islandora_solution_pack_cmdi/xsl/browser_cmdi2html.xsl was changed to use XSL 2.0.

Disabling the extension_loaded and dl checks and ignoring the segfaults showed that the rendering using XSL 2.0 actually works!

ddavis commented 7 years ago

My team is working on Islandora 7.x.1.x and on CLAW. In 7.x.1.x, using PHP 5.6.x on RHEL 6.x, I can run Saxon/C reliably (no issues with segfaults), but I am having a configuration issue. If I remove the xsl extension and install saxon, Saxon\SaxonProcessor() is found. However, with the xsl extension installed, the Saxon classes are not found, and without the xsl extension the existing XSL code does not work. I am looking for a configuration in which I can load saxon for a module (or drush) but not globally. Do you have any insight?
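For command-line runs, one way to load an extension for a single invocation rather than globally is the PHP CLI's -d option, which sets an ini entry for that run only. The following is a sketch under assumptions, not a confirmed fix for the conflict with the xsl extension: the wrapper name php_with_saxon and the PHP_BIN override are hypothetical, and it does not apply to Apache PHP.

```shell
# Run PHP with the Saxon extension loaded for this invocation only,
# leaving the global php.ini untouched.
# PHP_BIN is a hypothetical override for the interpreter to call.
php_with_saxon() {
    "${PHP_BIN:-php}" -d extension=saxon.so "$@"
}
```

For example, php_with_saxon -m would list the modules with saxon loaded just for that call.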

pautri commented 7 years ago

Hi Dan, I saw on the Islandora list that you actually managed to get it working, so we might want to give this another try as well. I'm copying the link below in case anybody else stumbles upon this.

https://groups.google.com/forum/#!topic/islandora/HEyDxyvy-Y8

menzowindhouwer commented 7 years ago

Hi, good to hear you got it going! I also had another try recently with the new version of Saxon C (1.0.2), but I had to switch to a lower PHP version (5.6) than the one that came by default (7.0) with the latest Ubuntu. So I would also be interested in your Saxon C recompilation experiences!

Here are the relevant Dockerfile snippets (if only for bookkeeping):

apt-get -y --allow-unauthenticated install php7.0 php5.6 libapache2-mod-php5.6 php5.6-gd php5.6-pgsql php5.6-xsl php5.6-curl php5.6-dev

and

RUN mkdir -p /tmp/saxon &&\
    cd /tmp/saxon &&\
    wget "http://www.saxonica.com/saxon-c/libsaxon-HEC-setup64-v1.0.2.zip" &&\
    unzip libsaxon-HEC-setup64-v1.0.2.zip &&\
    ./libsaxon-HEC-setup64-v1.0.2 -dest /tmp/saxon/Saxon-HEC1.0.2 &&\
    cd ./Saxon-HEC1.0.2 &&\
    cp ./libsaxonhec.so /usr/lib/libsaxonhec.so &&\
    cp ./libsaxonhec.so /usr/lib/libsaxon.so &&\
    cp -r ./rt /usr/lib/rt &&\
    mkdir -p /usr/lib/rt/lib/jetjvm &&\
    ln -s $JAVA_HOME/jre/lib/amd64/server/libjvm.so /usr/lib/rt/lib/jetjvm/libjvm.so
ENV LD_LIBRARY_PATH=/usr/lib/rt/lib/amd64:/usr/lib/rt/lib/amd64/jetvm:$LD_LIBRARY_PATH
RUN cd /tmp/saxon/Saxon-HEC1.0.2 &&\
    cp -r saxon-data /usr/lib/saxon-data &&\
    echo "# JetVM env path (required for Saxon)" > /etc/ld.so.conf.d/jetvm.conf &&\
    echo "/usr/lib/rt/lib/amd64" >> /etc/ld.so.conf.d/jetvm.conf &&\
    echo "/usr/lib/rt/lib/amd64/jetvm" >> /etc/ld.so.conf.d/jetvm.conf &&\
    ldconfig &&\
    echo "export LD_LIBRARY_PATH=/usr/lib/rt/lib/amd64:/usr/lib/rt/lib/amd64/jetvm:$LD_LIBRARY_PATH" >> /etc/apache2/envvars &&\
    export CPLUS_INCLUDE_PATH=/usr/lib/rt/amd64:$JAVA_HOME/include/linux:$JAVA_HOME/include &&\
    cd ./Saxon.C.API &&\
    phpize &&\
    ./configure --enable-saxon &&\
    make &&\
    make install

menzowindhouwer commented 6 years ago

https://dev.saxonica.com/repos/archive/opensource/latest9.8/hec/Dockerfile