Closed tiborsimko closed 10 years ago
Originally on 2010-08-18
More info for clarification: while the MARC standard seems to suggest that 001 is mandatory, the MARCXML schema seems to allow omitting it, which suits records that do not live as stored records but only as MARCXML snippets. So we either have to be prepared for those, or we should require passing a fake record ID such as 0.
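As an illustration of the "be prepared for those" option, here is a minimal sketch that checks whether a MARCXML snippet carries a 001 controlfield at all. It uses only the Python standard library; the helper name get_control_number and the two sample snippets are made up for this example and are not part of BibFormat.

```python
import xml.etree.ElementTree as ET


def get_control_number(marcxml):
    """Return the text of controlfield 001, or None if the snippet has none.

    Hypothetical helper for illustration only, not a BibFormat function.
    """
    root = ET.fromstring(marcxml)
    for elem in root.iter():
        # Drop a "{namespace}" prefix so that both namespaced and
        # un-namespaced MARCXML are handled the same way.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "controlfield" and elem.get("tag") == "001":
            return (elem.text or "").strip() or None
    return None


# Sample snippets (assumptions for the example): one with 001, one without.
WITH_001 = (
    '<record><controlfield tag="001">12</controlfield>'
    '<datafield tag="245" ind1=" " ind2=" ">'
    '<subfield code="a">Some title</subfield></datafield></record>'
)
WITHOUT_001 = (
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<datafield tag="245" ind1=" " ind2=" ">'
    '<subfield code="a">External basket item</subfield></datafield></record>'
)

print(get_control_number(WITH_001))
print(get_control_number(WITHOUT_001))
```

A caller could use the None result to decide whether to fall back to a fake record ID or to skip ID-dependent output.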
In any case, EndNote and some other output formats currently ignore the MARCXML snippet that is passed via the xml_record argument to format_record(), and seem to rely solely on getting the information via recID, which cannot work for snippet-only records such as external basket items. Example:
In [20]: z = "foo bar blah"
In [21]: format_record(1, 'xe', xml_record=z, on_the_fly=True)
Out[22]: '<abbr class="unapi-id" title="1"></abbr>\n<strong>ALEPH experiment: Candidate of Higgs boson production</strong> / <a href="http://pcuds33.cern.ch/search?ln=en&p=Photolab&f=author">Photolab</a>; 14 06 2000.<br /><small>Candidate for the associated production of the Higgs boson and Z boson. [...]</small><br /><small class="note"><a class="note" href="http://pcuds33.cern.ch/record/1/files/0106015_01.jpg">http://pcuds33.cern.ch/record/1/files/0106015_01.jpg</a></small><br /><small class="note"><a class="note" href="http://pcuds33.cern.ch/record/1/files/0106015_01.gif?subformat=icon">http://pcuds33.cern.ch/record/1/files/0106015_01.gif?subformat=icon</a></small>'
In [23]: format_record(2, 'xe', xml_record=z, on_the_fly=True)
Out[23]: '<abbr class="unapi-id" title="2"></abbr>\n<strong>The first CERN-built module of the barrel section of ATLAS\'s electromagnetic calorimeter</strong> / <a href="http://pcuds33.cern.ch/search?ln=en&p=Patrice+Lo%C3%AFez&f=author">Patrice Lo\xc3\xafez</a>; 10 Apr 2001.<br /><small>Behind the module, left to right Ralf Huber, Andreas Bies and Jorgen Beck Hansen. [...]</small><br /><small class="note"><a class="note" href="http://pcuds33.cern.ch/record/2/files/0104007_02.jpeg">http://pcuds33.cern.ch/record/2/files/0104007_02.jpeg</a></small><br /><small class="note"><a class="note" href="http://pcuds33.cern.ch/record/2/files/0104007_02.gif?subformat=icon">http://pcuds33.cern.ch/record/2/files/0104007_02.gif?subformat=icon</a></small>'
Originally on 2010-08-23
The limitation is mostly due to the support of extension functions in BibFormat XSL: http://invenio-demo.cern.ch/help/admin/bibformat-admin-guide#xslFormatTemplate
- fn:modification_date(recID)
- fn:creation_date(recID)
- fn:eval_bibformat(recID, bibformat_template_code)
The first two functions need the recid to retrieve this information in an XSL context. They could be extended to a) not fail if no recID is given and/or b) retrieve this information from baskets if possible (did they use to have something like a negative "recid"?). In some cases output without this information would not make sense anyway and would be invalid (e.g. RSS output and its <pubDate> tag).
The last function, which lets one run any BibFormat template/element from XSL templates, is a bit trickier to fix. Though the recid is just used to instantiate a BibFormatObject (bfo), which could very well be instantiated with an XML snippet too, it might be impossible to access the currently processed XML from the eval_bibformat(..) function. If that is not possible, one could add a new "marcxml" parameter to the function, which could be provided from the template itself: fn:eval_bibformat(recID, bibformat_template_code, marcxml)
<xsl:value-of select="fn:eval_bibformat(marc:controlfield[@tag='001'],'<BFE_SERVER_INFO var="recurl">',marc:.)" />
This might have some impact on speed though, and might not be possible in all cases.
Other alternatives, which can be combined:
Originally on 2010-08-23
Replying to [comment:2 jcaffaro]:
- whenever an XSL template is processed without a recid, do not process the above extension functions.
I think a simple and speedy solution of this kind may be sufficient for most use cases. ("do not process non-applicable elements")
Otherwise a more generic solution would be to extend eval_bibformat to accept a MARCXML snippet argument, as you proposed, but that would not fully work for extracting non-MARC information anyway. (We would have to introduce more FFT-like elements.) I think we don't have to go this way unless somebody has some concrete use cases at hand.
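The simpler "do not process non-applicable elements" option could look roughly like the sketch below. This is not the BibFormat code: creation_date_or_empty is a hypothetical stand-in for an extension function such as fn:creation_date, and the dict passed as known_dates stands in for whatever lookup the real function does by record ID.

```python
def creation_date_or_empty(rec_id, known_dates):
    """Hypothetical guard around an extension function that needs a record ID.

    known_dates is a plain dict standing in for the real lookup by ID.
    """
    if rec_id in (None, ""):
        # Snippet-only record: nothing to look up, so return an empty
        # value instead of raising.
        return ""
    return known_dates.get(rec_id, "")


dates = {1: "2010-08-18"}
print(creation_date_or_empty(1, dates))
print(repr(creation_date_or_empty(None, dates)))
```

Whether an empty value is acceptable depends on the output format; as noted earlier in the thread, some formats would end up invalid without the date.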
BTW, note that my foo bar blah example in one of the above comments shows that the recID argument takes precedence over the xml_record argument, which runs counter to the expected behaviour of format_record() as well as to its docstring. This should be investigated and fixed at the same time.
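To spell out the expected precedence, here is a small sketch in which an explicitly passed snippet wins over the record ID. The function pick_record_source is hypothetical and only illustrates the ordering; it does not reproduce how format_record() is implemented.

```python
def pick_record_source(rec_id=None, xml_record=None):
    """Decide which input a formatter should use (illustrative only).

    An explicitly passed MARCXML snippet takes precedence over the record ID.
    """
    if xml_record is not None:
        return ("xml", xml_record)
    if rec_id is not None:
        return ("id", rec_id)
    raise ValueError("either rec_id or xml_record is required")


# With both arguments given, the snippet is chosen.
print(pick_record_source(rec_id=1, xml_record="<record/>"))
# With only an ID, the stored record is used.
print(pick_record_source(rec_id=2))
```

With this ordering, the two calls in the example above would both format the passed snippet rather than stored records 1 and 2.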
Originally by Jerome Caffaro jerome.caffaro@cern.ch on 2011-03-25
In [aa54dcd4aee1d0f04934cfbe8f9be06f0247ac8e]:
#CommitTicketReference repository="" revision="aa54dcd4aee1d0f04934cfbe8f9be06f0247ac8e"
BibFormat: fix XSLT formatting of MARCXML snippets
- Fix formatting of MARCXML given as parameter ("xml_record"
parameter, instead of records specified by ID with "recID"
parameter) when using XSL templates. (fixes #251)
- Improve docstrings.
Originally by Jerome Caffaro jerome.caffaro@cern.ch on 2012-02-15
In [66b3ff115e2f098b5c2c86d89439d3d0476c6d18]:
#CommitTicketReference repository="" revision="66b3ff115e2f098b5c2c86d89439d3d0476c6d18"
BibFormat: fix XSLT formatting of MARCXML snippets
- Fix formatting of MARCXML given as parameter ("xml_record"
parameter, instead of records specified by ID with "recID"
parameter) when using XSL templates. (fixes #251)
- Improve docstrings.
Originally by Jerome Caffaro jerome.caffaro@cern.ch on 2012-08-09
In 66b3ff115e2f098b5c2c86d89439d3d0476c6d18:
#CommitTicketReference repository="" revision="66b3ff115e2f098b5c2c86d89439d3d0476c6d18"
BibFormat: fix XSLT formatting of MARCXML snippets
- Fix formatting of MARCXML given as parameter ("xml_record"
parameter, instead of records specified by ID with "recID"
parameter) when using XSL templates. (fixes #251)
- Improve docstrings.
Originally on 2010-08-18
There are some problems when using format_record() on MARCXML snippets that do not have all expected fields and/or a real record ID (tag 001).
1) A small problem is that recID cannot really be None if MARCXML is passed, since it leads to tracebacks in statements like:
We can live with this by passing a fake recID, but we should probably document it in the docstring.
2) The real problem is that some output formats such as EndNote and RefWorks seem to assume the presence of many fields, which is not the case for e.g. external items in baskets, which have only a handful of fields defined and do not even have 001.
A simple test case that fails:
A test value that works:
but eliminate 001 from the snippet and it will stop working:
The typical error is: