ropensci / xslt

Extension of xml2 package for xsl transformations
https://docs.ropensci.org/xslt

calling xml_xslt causes R to close #1

Closed: pmorrisNSF closed this issue 7 years ago

pmorrisNSF commented 7 years ago

Dear Jeroen and the ropensci team,

Many thanks for developing the xslt library. It installs well in R 3.4.0 and the example works as expected. However, when I try to use one of my own version 1.0 XSLT files (which processes correctly in Saxon) with xml_xslt, it causes my R session to close.

Apologies if I am doing something wrong, but I wondered whether anything in my XSLT (shown below) would cause xslt to close my R session.

Much thanks again,

Paul Morris, National Science Foundation

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <xsl:strip-space elements="*" />

  <xsl:output encoding="UTF-8" method="text" />

  <xsl:key name="labelById" match="labels/list/label" use="@id" />

  <xsl:template match="/">Label,Proposal_ID,Submission_Date
<xsl:apply-templates select="result/documents/list" />
  </xsl:template>

  <xsl:template match="documents/list/document">"<xsl:apply-templates select="labels" />",<xsl:value-of select="content/field[@name='Division']" />,<xsl:value-of select="content/field[@name='Proposal_Title']" /><xsl:text>
</xsl:text>
  </xsl:template>

</xsl:stylesheet>

jeroen commented 7 years ago

Can you include an example of an xml document that you would apply this stylesheet to, so that we can reproduce the error / crash?

pmorrisNSF commented 7 years ago

Dear Jeroen,

Many thanks for getting back to us. The original XML had some confidential information in it, so I'm sending the same XML with the sensitive information replaced. Hopefully you can pick up example.xml and list.xsl from the enclosed Dropbox links.

The xslt works as expected with the xml file in Saxon and MSXSL, but the R commands below crash R 3.4.0 (both 32 bit and 64 bit).

library(xslt) 
doc <- read_xml('https://www.dropbox.com/s/t5ii8fhedk7i3xp/example.xml?dl=1')
style <- read_xml('https://www.dropbox.com/s/jpq8698649fqpge/list.xsl?dl=1')
out <- xml_xslt(doc, style)

Thanks again for your development work with xslt in R, it will be hugely useful and we hope the supplied information is interesting. Please let us know if you need any more from us.

Kind regards,

Paul Morris, Office of the Director/Data Group, National Science Foundation

pmorrisNSF commented 7 years ago

Dear Jeroen,

Just checking whether the links I provided worked OK?

With thanks

Paul

Paul Morris, Office of the Director/Data Group, National Science Foundation

jeroen commented 7 years ago

So the problem is simply that our xslt package assumes the result will be another XML document. However, your stylesheet generates non-XML, text-based data, which is a bit confusing.

I am looking into the libxslt documentation to see how to properly detect the output format.
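In the meantime, a caller can inspect the stylesheet's declared output method from R with xml2 alone. This is only a user-side sketch, not how the xslt package detects the format internally; the helper name output_method and the inline stylesheet are made up for illustration.

```r
library(xml2)

# Hypothetical miniature stylesheet, used only for this illustration.
style <- read_xml('<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output encoding="UTF-8" method="text"/>
  <xsl:template match="/">hello</xsl:template>
</xsl:stylesheet>')

# output_method() is a hypothetical helper, not part of xml2 or xslt.
# It reads the method attribute of a top-level <xsl:output> element.
output_method <- function(stylesheet) {
  # Bind the prefix explicitly so the lookup does not depend on the
  # prefix the stylesheet author happened to choose.
  ns <- c(xsl = "http://www.w3.org/1999/XSL/Transform")
  node <- xml_find_first(stylesheet, "/*/xsl:output", ns)
  method <- xml_attr(node, "method")
  # Simplification: XSLT 1.0 can also default to "html" depending on
  # the result tree; here a missing declaration is reported as "xml".
  if (is.na(method)) "xml" else method
}

method <- output_method(style)
print(method)
```

A stylesheet that reports "text" here, like the one in this issue, is one whose result is not an XML document.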

For now, as a workaround, you can simply use as.character() to get to the text output. That should give you the proper output, however with an additional line at the top, which you can ignore.

library(xslt) 
doc <- read_xml('https://www.dropbox.com/s/t5ii8fhedk7i3xp/example.xml?dl=1')
style <- read_xml('https://www.dropbox.com/s/jpq8698649fqpge/list.xsl?dl=1')
out <- xml_xslt(doc, style)
as.character(out)

I'll try to properly fix this soon.

pmorrisNSF commented 7 years ago

Thanks, Jeroen, for the investigation and possible workaround. I still get the "Memory allocation failed" error when I type the above into R 3.4.0; see the output below. Apologies if I have missed something here.

Paul

library(xslt)
Loading required package: xml2
doc <- read_xml('https://www.dropbox.com/s/t5ii8fhedk7i3xp/example.xml?dl=1')
style <- read_xml('https://www.dropbox.com/s/jpq8698649fqpge/list.xsl?dl=1')
out <- xml_xslt(doc, style)
Error in doc_xslt_apply(doc$doc, stylesheet$doc) : Memory allocation failed [2]
as.character(out)
[1] "-0.182574182748795" "1" "-1" "5.47722578048706" "7.66666650772095"

jeroen commented 7 years ago

OK, I fixed the problem (it was a Windows-specific issue). I have released xslt 1.1 to CRAN. Try to reinstall the package in a clean new R session (i.e. the xslt package should not be loaded):

install.packages("xslt", type = "source")

And then test your example:

library(xslt) 
doc <- read_xml('https://www.dropbox.com/s/t5ii8fhedk7i3xp/example.xml?dl=1')
style <- read_xml('https://www.dropbox.com/s/jpq8698649fqpge/list.xsl?dl=1')
out <- xml_xslt(doc, style)
cat(out)

It should now return text instead of an xml document. Thanks for reporting this problem!
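For readers without access to the Dropbox files, here is a minimal self-contained sketch of the same behaviour. It assumes xslt 1.1 or later, which per this thread returns a character string when the stylesheet declares method="text". The inline document and stylesheet are invented stand-ins, not the reporter's example.xml or list.xsl.

```r
library(xml2)
library(xslt)

# Hypothetical input that loosely mirrors the structure the reporter's
# stylesheet expects (result/documents/list/document).
doc <- read_xml('<result><documents><list>
  <document><content>
    <field name="Division">DMS</field>
    <field name="Proposal_Title">Title A</field>
  </content></document>
  <document><content>
    <field name="Division">EAR</field>
    <field name="Proposal_Title">Title B</field>
  </content></document>
</list></documents></result>')

# Text-output stylesheet: one CSV header line, then one line per document.
style <- read_xml('<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:strip-space elements="*"/>
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:template match="/">
    <xsl:text>Division,Proposal_Title&#10;</xsl:text>
    <xsl:apply-templates select="result/documents/list/document"/>
  </xsl:template>
  <xsl:template match="document">
    <xsl:value-of select="content/field[@name=\'Division\']"/>
    <xsl:text>,</xsl:text>
    <xsl:value-of select="content/field[@name=\'Proposal_Title\']"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>')

# With method="text" the result is expected to be a string, not an
# xml_document, so it can be printed or parsed directly.
out <- xml_xslt(doc, style)
cat(out)

# Because the text happens to be CSV, base R can read it into a data frame.
csv <- read.csv(text = out)
```

Keeping the literal text inside xsl:text avoids stray indentation in the output, since whitespace-only text nodes elsewhere in a stylesheet are dropped.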

pmorrisNSF commented 7 years ago

Hi Jeroen,

Thanks for investigating. I am still seeing the same "memory allocation failed" error on two separate Windows machines running R 3.4.0 (32 bit) with just devtools installed and Rtools34.exe (for building packages on Windows).

Below is the console output from last night when I tried to replicate your command sequence. I'll try on a Mac or Linux OS, but all machines at NSF are running Windows 7 and Windows 10 (the error is the same on both Win versions). I can't see any errors in the compiling of ropensci/xslt.

Apologies for the extra work,

Paul

devtools::install_github("ropensci/xslt", force=TRUE)
Downloading GitHub repo ropensci/xslt@master
from URL https://api.github.com/repos/ropensci/xslt/zipball/master
Installing xslt
"C:/R/R-3.4.0/bin/i386/R" --no-site-file --no-environ --no-save --no-restore --quiet CMD INSTALL \
  "C:/Users/pmorris/AppData/Local/Temp/1/RtmpsTacLb/devtools1ad8137f607f/ropensci-xslt-6841f63" --library="C:/R/R-3.4.0/library" --install-tests

*** arch - i386
rm -f RcppExports.o xslt.o xslt_init.o xml2.dll
"C:/R/R-34~1.0/bin/i386/Rscript.exe" "../tools/winlibs.R"
c:/Rtools/mingw_32/bin/g++ -I"C:/R/R-34~1.0/include" -DNDEBUG -I../windows/libxml2-2.9.4/include/libxml2 -I../windows/libxml2-2.9.4/include -DLIBXML_STATIC -DSTRICT_R_HEADERS -I"C:/R/R-3.4.0/library/Rcpp/include" -I"C:/R/R-3.4.0/library/xml2/include" -I"d:/Compiler/gcc-4.9.3/local330/include" -O2 -Wall -mtune=core2 -c RcppExports.cpp -o RcppExports.o
c:
c:/Rtools/mingw_32/bin/g++ -I"C:/R/R-34~1.0/include" -DNDEBUG -I../windows/libxml2-2.9.4/include/libxml2 -I../windows/libxml2-2.9.4/include -DLIBXML_STATIC -DSTRICT_R_HEADERS -I"C:/R/R-3.4.0/library/Rcpp/include" -I"C:/R/R-3.4.0/library/xml2/include" -I"d:/Compiler/gcc-4.9.3/local330/include" -O2 -Wall -mtune=core2 -c xslt_init.cpp -o xslt_init.o
c:/Rtools/mingw_32/bin/g++ -shared -s -static-libgcc -o xslt.dll tmp.def RcppExports.o xslt.o xslt_init.o -L../windows/libxml2-2.9.4/lib/i386 -lexslt -lxslt -lxml2 -llzma -liconv -lz -lws2_32 -Ld:/Compiler/gcc-4.9.3/local330/lib/i386 -Ld:/Compiler/gcc-4.9.3/local330/lib -LC:/R/R-34~1.0/bin/i386 -lR
installing to C:/R/R-3.4.0/library/xslt/libs/i386

* arch - x64
rm -f RcppExports.o xslt.o xslt_init.o xml2.dll
"C:/R/R-34~1.0/bin/x64/Rscript.exe" "../tools/winlibs.R"
c:/Rtools/mingw_64/bin/g++ -I"C:/R/R-34~1.0/include" -DNDEBUG -I../windows/libxml2-2.9.4/include/libxml2 -I../windows/libxml2-2.9.4/include -DLIBXML_STATIC -DSTRICT_R_HEADERS -I"C:/R/R-3.4.0/library/Rcpp/include" -I"C:/R/R-3.4.0/library/xml2/include" -I"d:/Compiler/gcc-4.9.3/local330/include" -O2 -Wall -mtune=core2 -c RcppExports.cpp -o RcppExports.o
c:/Rtools/mingw_64/bin/g++ -I"C:/R/R-34~1.0/include" -DNDEBUG -I../windows/libxml2-2.9.4/include/libxml2 -I../windows/libxml2-2.9.4/include -DLIBXML_STATIC -DSTRICT_R_HEADERS -I"C:/R/R-3.4.0/library/Rcpp/include" -I"C:/R/R-3.4.0/library/xml2/include" -I"d:/Compiler/gcc-4.9.3/local330/include" -O2 -Wall -mtune=core2 -c xslt.cpp -o xslt.o
c:/Rtools/mingw_64/bin/g++ -I"C:/R/R-34~1.0/include" -DNDEBUG -I../windows/libxml2-2.9.4/include/libxml2 -I../windows/libxml2-2.9.4/include -DLIBXML_STATIC -DSTRICT_R_HEADERS -I"C:/R/R-3.4.0/library/Rcpp/include" -I"C:/R/R-3.4.0/library/xml2/include" -I"d:/Compiler/gcc-4.9.3/local330/include" -O2 -Wall -mtune=core2 -c xslt_init.cpp -o xslt_init.o
c:/Rtools/mingw_64/bin/g++ -shared -s -static-libgcc -o xslt.dll tmp.def RcppExports.o xslt.o xslt_init.o -L../windows/libxml2-2.9.4/lib/x64 -lexslt -lxslt -lxml2 -llzma -liconv -lz -lws2_32 -Ld:/Compiler/gcc-4.9.3/local330/lib/x64 -Ld:/Compiler/gcc-4.9.3/local330/lib -LC:/R/R-34~1.0/bin/x64 -lR
installing to C:/R/R-3.4.0/library/xslt/libs/x64
R
inst
tests
preparing package for lazy loading
* help
installing help indices
building package indices
testing if installed package can be loaded
arch - i386
arch - x64

out <- xml_xslt(doc, style)

Error in doc_xslt_apply(doc$doc, stylesheet$doc) : Memory allocation failed [2]

jeroen commented 7 years ago

I fixed some issues earlier today. Does this happen with xslt 1.1? Could you show me your sessionInfo()?
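The information requested here can be gathered with base R alone. The sketch below assumes nothing beyond the thread's statement that the fix shipped in xslt 1.1; it reads the installed version without loading the package, which matters because the reinstall advice above asks for a session where xslt is not loaded.

```r
# Capture platform, R version and attached packages for a bug report.
info <- sessionInfo()
print(info)

# system.file() returns "" when the package is not installed, and
# packageVersion() reads the version without attaching the package.
if (nzchar(system.file(package = "xslt"))) {
  installed <- packageVersion("xslt")
  status <- if (installed >= "1.1") "1.1 or later" else "older than 1.1"
  message("xslt ", as.character(installed), " is installed (", status, ")")
} else {
  message("xslt is not installed in this library")
}
```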

pmorrisNSF commented 7 years ago

Hi Jeroen,

The errors above were from the test I did late last night, but now everything works with xslt 1.1 (see below), fantastic!

Very impressed, and lots of happy users here at NSF. Thank you for troubleshooting.

Paul

doc <- read_xml('https://www.dropbox.com/s/t5ii8fhedk7i3xp/example.xml?dl=1')
Error in read_xml("https://www.dropbox.com/s/t5ii8fhedk7i3xp/example.xml?dl=1") : could not find function "read_xml"
library(xslt)
Loading required package: xml2
doc <- read_xml('https://www.dropbox.com/s/t5ii8fhedk7i3xp/example.xml?dl=1')
style <- read_xml('https://www.dropbox.com/s/jpq8698649fqpge/list.xsl?dl=1')
out <- xml_xslt(doc, style)
out
[1] "Label,Proposal_ID,Directorate,Division,Submission_Date\n\"computational models (2)\",1111111,MPS,DMS,20161004\n\"water quality (20), career opportunities (10), student interest (6), internship opportunities (6), paid internships (5), student recruitment (4), student mentoring (4), liberal arts (4), faculty mentor (4), critical thinking (4), summer workshop (3), graduation rates (3), application process (3), undergraduate degree (2), test hypotheses (2), supervise student (2), summative evaluation (2), student enrollment (2), science teachers (2), research mentor (2)\",1111111,GEO,EAR,20161010\n\"heavy metals (10), student mentoring (4), drinking water (4), transmission electron (3), room temperature (3), new materials (3), transmission electron microscopy (2), supporting letter (2), solid state (2), graduate student mentors (2), electron microscopy (2), aqueous solution (2)\",2222222,MPS,CHE,20161003\n\"steering committee (8), bachelors degree (5), two-year colleges (4), faculty m...

jeroen commented 7 years ago

OK thanks for your patience :)

jeroen commented 7 years ago

The package has been published on CRAN, so in a day or two you should be able to get it via install.packages("xslt").

pmorrisNSF commented 7 years ago

Excellent job. We were having to make xslt transformations via an external call to Saxon or msxsl, so this is great.

It will be put to immediate use, thanks again.

Paul