adsabs / export_service

Export service to output ADS records with various formats including BibTex, AASTex, and multiple tagged and xml options
MIT License

replace html entities `&lt;`, `&gt;`, and `'&amp;'` #173

Open golnazads opened 4 years ago

golnazads commented 4 years ago

@aaccomazzi this is done at the point where the result is retrieved from Solr, before the output format is known. Could this approach cause any issues? You mentioned <SUP> and <SUB>; those are encoded only for LaTeX.

aaccomazzi commented 4 years ago

The transformation of entities and markup such as <SUP> and the like should be controlled by the target output format. ADS Classic gets most (but not all) of this right, so please compare and contrast when in doubt.

Full discussion available here: https://github.com/adsabs/export_service/issues/172
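A minimal sketch of letting the target output format control the markup transformation, as suggested above. Everything here is hypothetical: the function names, the format keys, and the choice of rendering <SUP>/<SUB> as math-mode superscripts and subscripts for LaTeX are assumptions for illustration, not the service's actual code.

```python
import re


def _latex_markup(text):
    # Assumed convention: <SUP>/<SUB> become math-mode super/subscripts.
    text = re.sub(r"<SUP>(.*?)</SUP>", r"$^{\1}$", text, flags=re.IGNORECASE)
    text = re.sub(r"<SUB>(.*?)</SUB>", r"$_{\1}$", text, flags=re.IGNORECASE)
    return text


def _passthrough(text):
    # Formats with no registered encoder get the text unchanged.
    return text


# Hypothetical registry: one encoder per target output format.
ENCODERS = {
    "latex": _latex_markup,
}


def encode_for_format(text, export_format):
    """Apply the markup transformation chosen by the target format."""
    return ENCODERS.get(export_format, _passthrough)(text)
```

With this shape, the Solr result stays untouched until the format is known, and each format's encoder can be compared against ADS Classic output independently.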

golnazads commented 4 years ago

I looked up ADS Classic for bibcode 2016ApJ...818L..26F. In the abstract we have

... From these composite SEDs we analyze the rest-frame UVJ colors, as well as the ratio of IR to UV light (IRX) and the UV slope (β) in the IRX-β dust relation at 1 < z < 3. ...

which for XML has been encoded as

... From these composite SEDs we analyze the rest-frame UVJ colors, as well as the ratio of IR to UV light (IRX) and the UV slope (β) in the IRX-β dust relation at 1 &#60; z &#60; 3. ...

Hence, for XML output, the HTML entities &lt;, &gt;, and '&amp;' shall be encoded as the numeric character references &#60;, &#62;, and &#38;, respectively.
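The mapping above could be sketched as follows. This assumes the text coming back from Solr already contains the named entities; the function and constant names are hypothetical and only illustrate the rule, not how the service implements it.

```python
import re

# The three named entities from this issue and their numeric references.
NAMED_TO_NUMERIC = {
    "&lt;": "&#60;",
    "&gt;": "&#62;",
    "&amp;": "&#38;",
}

# Match only these three entities; anything else is left alone.
_ENTITY_RE = re.compile(r"&(?:lt|gt|amp);")


def to_numeric_entities(text):
    """Replace &lt;, &gt; and &amp; with numeric character references."""
    # A single pass, so the '&' in an emitted '&#60;' is never re-encoded.
    return _ENTITY_RE.sub(lambda m: NAMED_TO_NUMERIC[m.group(0)], text)
```

For the abstract quoted above, `to_numeric_entities("at 1 &lt; z &lt; 3")` gives `at 1 &#60; z &#60; 3`, matching the Classic XML output.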