rahuldave / appsem

js server and ui for semflow

Abstract display is cut short due to unescaped < #12

Open DougBurke opened 13 years ago

DougBurke commented 13 years ago

I noted that 2007ApJ...662..145H is showing an abstract of:

Abstract: Using Chandra and HST archival data, we have studied the individual SED of 11 quasars at redshifts 0.3

Given that it loses information right at the point where the abstract contains a < character, it looks like an HTML-escaping issue. I've checked that reuters.theme.js is getting the full string, so it's not a data ingest problem but a display one. My simple attempts to resolve this have failed.
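For illustration only, here is a minimal Python sketch of what escaping the text before display would do to the affected fragment. The helper name `escape_for_display` is hypothetical, and the actual fix would have to live in the JavaScript display code; this only shows the intended transformation.

```python
import html

# Fragment of the stored abstract for 2007ApJ...662..145H.
fragment = "11 quasars at redshifts 0.3<z<1.8. All UV spectra show a spectral break"


def escape_for_display(text):
    """Escape &, < and > so a browser shows them as literal characters.

    Hypothetical helper; quotes are left alone because the text is
    meant for element content, not for an attribute value.
    """
    return html.escape(text, quote=False)


escaped = escape_for_display(fragment)
print(escaped)
```

Escaping everything like this would also neutralise any markup that is meant to be rendered, so it is only a starting point.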

Note that this particular paper, which has Chandra data, doesn't appear in semantic2 on labs but does in my local install.

DougBurke commented 13 years ago

Note that < in other abstracts displays correctly - e.g. 2006ApJS..164..173M - so it's not obvious what the problem is.
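One possible explanation, offered as a guess rather than a confirmed diagnosis: an HTML parser only treats "<" as the start of markup when it is followed by an ASCII letter, "/", "!" or "?". In 2007ApJ...662..145H the first "<" is followed by "z", whereas in 2006ApJS..164..173M the bare "<" is followed by "=". The sketch below applies that rule (simplified) to fragments of both abstracts; `markup_like_openers` is a hypothetical helper name.

```python
import re

# Simplified version of the HTML tokenizer rule: "<" opens markup only
# when followed by an ASCII letter, "/", "!" or "?". Anything else
# (a digit, "=", "~", "+", ...) leaves the "<" as plain text.
TAG_OPEN = re.compile(r"<(?=[A-Za-z/!?])")


def markup_like_openers(text):
    """Return each "<" plus the character after it, for every "<" that
    a browser would try to parse as markup (hypothetical helper)."""
    return [text[m.start():m.start() + 2] for m in TAG_OPEN.finditer(text)]


# Fragment from 2007ApJ...662..145H (display is cut short).
truncated = "redshifts 0.3<z<1.8. All UV spectra"
# Fragment from 2006ApJS..164..173M (displays correctly).
displayed = "lengths <=0.5R<SUB>*</SUB>) may be found"

print(markup_like_openers(truncated))
print(markup_like_openers(displayed))
```

Under this rule the "<z" in the first fragment looks like the start of a tag, and since the rest of that abstract contains no ">", the remaining text would be consumed as part of the bogus tag. In the second fragment only the real SUB tags are flagged.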

DougBurke commented 13 years ago

Here's what's in the RDF store for the two abstracts:

>>> res = sesame.makeQuery('SELECT ?s ?p { ?s ?p "2007ApJ...662..145H". }')
>>> print(res)
[{'p': {'type': 'uri', 'value': 'https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#workIdentifier'}, 's': {'type': 'uri', 'value': 'http://ads.harvard.edu/sem/bib#0bd01d1b-ef83-41e7-b07c-c7721d137834'}}, {'p': {'type': 'uri', 'value': 'https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#bibcode'}, 's': {'type': 'bnode', 'value': 'node164vsa9ibx26166'}}, {'p': {'type': 'uri', 'value': 'https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#identifier'}, 's': {'type': 'bnode', 'value': 'node164vsa9ibx265905'}}, {'p': {'type': 'uri', 'value': 'https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#identifier'}, 's': {'type': 'bnode', 'value': 'node164vsa9ibx265987'}}]
>>> res2 = sesame.makeQuery('SELECT ?p ?t { <http://ads.harvard.edu/sem/bib#0bd01d1b-ef83-41e7-b07c-c7721d137834> ?p [ <https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#abstractText> ?t ] . }')
>>> res2[0]
{'p': {'type': 'uri', 'value': 'https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#hasAbstract'}, 't': {'type': 'literal', 'value': u"Using Chandra and HST archival data, we have studied the individual SED of 11 quasars at redshifts 0.3<z<1.8. All UV spectra show a spectral break around 1100 \xc5. Five X-ray spectra showed the presence of a ``soft excess,'' and seven spectra showed an intrinsic absorption. We found that for most quasars a simple extrapolation of the far-UV power law into the X-ray domain generally lies below the X-ray data and that the big blue bump and the soft X-ray excess do not share a common physical origin. We explore the issue of whether the observed SED might be dust absorbed in the far- and near-UV. We fit the UV break, assuming a power law that is absorbed by cubic nanodiamond dust grains. We then explore the possibility of a universal SED (with a unique spectral index) by including further absorption from SMC-like extinction. Using this approach, satisfactory fits to the spectra can be obtained. The hydrogen column densities required by either nanodiamonds or amorphous dust models are all consistent, except for one object, with the columns deduced by our X-ray analysis, provided that the C depletion is ~0.6. Because dust absorption implies a flux recovery in the EUV (<700 \xc5), our modeling opens the possibility that the intrinsic quasar SED is much harder and more luminous in the EUV than inferred from the near-UV data, as required by photoionization models of the broad emission line region. We conclude that the intrinsic UV SED must undergo a sharp turnover before the X-ray domain."}}

and

>>> res = sesame.makeQuery('SELECT ?s ?p { ?s ?p "2006ApJS..164..173M" . }')
>>> res
[{'p': {'type': 'uri', 'value': 'https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#workIdentifier'}, 's': {'type': 'uri', 'value': 'http://ads.harvard.edu/sem/bib#6fe04510-fc91-4a2e-8c99-1a46e814874c'}}, {'p': {'type': 'uri', 'value': 'https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#bibcode'}, 's': {'type': 'bnode', 'value': 'node164vsa9ibx62979'}}, {'p': {'type': 'uri', 'value': 'https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#identifier'}, 's': {'type': 'bnode', 'value': 'node164vsa9ibx63275'}}]
>>> res2 = sesame.makeQuery('SELECT ?t { <http://ads.harvard.edu/sem/bib#6fe04510-fc91-4a2e-8c99-1a46e814874c> ?p [ <https://github.com/rahuldave/ontoads/raw/master/owl/ADS-bibo.owl#abstractText> ?t ] . }')
>>> res2
[{'t': {'type': 'literal', 'value': "Dynamo activity in stars of different types is expected to generate magnetic fields with different characteristics. As a result, a differential study of the characteristics of magnetic loops in a broad sample of stars may yield information about dynamo systematics. In the absence of direct imaging, certain physical parameters of a stellar magnetic loop can be extracted if a flare occurs in that loop. In this paper we employ a simple nonhydrodynamic approach introduced by Haisch, to analyze a homogeneous sample of all of the flares we could identify in the EUVE DS database: a total of 134 flares that occurred on 44 stars ranging in spectral type from F to M and in luminosity class from V to III. All of the flare light curves that have been used in the present study were obtained by a single instrument (EUVE DS). For each flare, we have applied Haisch's simplified approach (HSA) in order to determine loop length, temperature, electron density, and magnetic field. For each of our target stars, a literature survey has been performed to determine quantitatively the extent to which our results are consistent with independent studies. The results obtained by HSA are found to be well supported by results obtained by other methods. Our survey suggests that, on the main sequence, short loops (with lengths <=0.5R<SUB>*</SUB>) may be found in stars of all classes, while the largest loops (with lengths up to 2R<SUB>*</SUB>) appear to be confined to M dwarfs. Based on EUVE data, the transition from small to large loops on the main sequence appears to occur between spectral types K2 and M0. We discuss the implications of this result for dynamo theories."}}]

DougBurke commented 13 years ago

Here's another interesting example: the abstract of 2003ApJS..146..165S becomes bold part-way through.

>>> res = sesame.makeQuery('SELECT ?s { ?s adsbib:workIdentifier "2003ApJS..146..165S" . }')
>>> uri = res[0]['s']['value']
>>> uri
'http://ads.harvard.edu/sem/bib#00291a27-4dda-481a-9864-337936e6357c'
>>> res2 = sesame.makeQuery('SELECT ?o { <http://ads.harvard.edu/sem/bib#00291a27-4dda-481a-9864-337936e6357c> adsbib:hasAbstract [ adsbib:abstractText ?o ] . }')
>>> res2[0]['o']['value']
u'We report the results of a FUSE study of high-velocity O VI absorption along complete sight lines through the Galactic halo in directions toward 100 extragalactic objects and two halo stars. The high-velocity O VI traces a variety of phenomena, including tidal interactions with the Magellanic Clouds, accretion of gas, outflowing material from the Galactic disk, warm/hot gas interactions in a highly extended Galactic corona, and intergalactic gas in the Local Group. We identify 84 high-velocity O VI features at >=3 \u03c3 confidence at velocities of -500<v<SUB>LSR</SUB><+500 km s<SUP>-1</SUP>. The 84 O VI features have velocity centroids ranging from -372<~v<SUB>LSR</SUB><~-90 km s<SUP>-1</SUP> to +93<~v<SUB>LSR</SUB><~+385 km s<SUP>-1</SUP>, line widths b~16-72 km s<SUP>-1</SUP> with an average of <b>=40+/-13 km s<SUP>-1</SUP>, and an average O VI column density <logN>=13.95+/-0.34 with a median value of 13.97. Values of b greater than the 17.6 km s<SUP>-1</SUP> thermal width expected for O VI at T~3\xd710<SUP>5</SUP> K indicate that additional nonthermal broadening mechanisms are common. The O VI \u03bb1031.926 absorption is detected at >=3 \u03c3 confidence along 59 of the 102 sight lines surveyed. The high-velocity O VI detections indicate that ~60% of the sky (and perhaps as much as ~85%, depending on data quality considerations) is covered by high-velocity H<SUP>+</SUP> associated with the O VI. We find that N(H<SUP>+</SUP>)>~10<SUP>18</SUP> cm<SUP>-2</SUP> if the high-velocity hot gas has a metallicity similar to that of the Magellanic Stream; this detection rate is considerably higher than that of high-velocity warm H I traced through its 21 cm emission at a comparable column density level. Some of the high-velocity O VI is associated with known H I structures (the Magellanic Stream, Complex A, Complex C, the Outer Spiral Arm, and several discrete H I HVCs). 
Some of the high-velocity O VI features have no counterpart in H I 21 cm emission, including discrete absorption features and positive velocity absorption wings extending from ~100 to ~300 km s<SUP>-1</SUP> that blend with lower velocity absorption produced by the Galactic thick disk/halo. The discrete features may typify clouds located in the Local Group, while the O VI absorption wings may be tidal debris or material expelled from the Galactic disk. Most of the O VI features have velocities incompatible with those of the Galactic halo, even if the halo has decoupled from the underlying Galactic disk. The reduction in the dispersion about the mean of the high-velocity O VI centroids when the velocities are converted from the LSR to the GSR and LGSR reference frames is necessary (but not conclusive) evidence that some of the clouds are located outside the Galaxy. Most of the O VI cannot be produced by photoionization, even if the gas is irradiated by extragalactic ultraviolet background radiation. Several observational quantities indicate that collisions in hot gas are the primary ionization mechanism responsible for the production of the O VI. These include the ratios of O VI column densities to those of other highly ionized species (C IV, N V) and the strong correlation between N(O VI) and O VI line width. Consideration of the possible sources of collisional ionization favors production of some of the O VI at the boundaries between cool/warm clouds of gas and a highly extended (R>~70 kpc), hot (T>10<SUP>6</SUP> K), low-density (n<~10<SUP>-4</SUP>-10<SUP>-5</SUP> cm<SUP>-3</SUP>) Galactic corona or Local Group medium. The existence of a hot, highly extended Galactic corona or Local Group medium and the prevalence of high-velocity O VI are consistent with predictions of current galaxy formation scenarios. 
Distinguishing between the various phenomena producing high-velocity O VI in and near the Galaxy will require continuing studies of the distances, kinematics, elemental abundances, and physical states of the different types of high-velocity O VI found in this study. Descriptions of galaxy evolution will need to account for the highly ionized gas, and future X-ray studies of hot gas in the Local Group will need to consider carefully the relationship of the X-ray absorption/emission to the complex high-velocity absorption observed in O VI.'
>>> 

The problem is the "with an average of <b>=40+/-13 km s<SUP>-1</SUP>" bit. Argh; this is not invalid markup that got into the abstract somehow, but the author really wanting to say '<b>'.
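A rough sketch of one way to handle this, in Python for illustration: escape everything, then restore only a whitelist of tags. It assumes that SUB and SUP are the only tags in the abstracts that are meant as markup, which is a guess based on the three examples above, and `render_abstract` is a hypothetical name.

```python
import html
import re

# Assumption: these are the only tags that are real markup in the
# abstracts; anything else in angle brackets is the author's own text.
ALLOWED_TAGS = ("SUB", "SUP")

# Matches an escaped opening or closing tag from the whitelist.
_ESCAPED_ALLOWED = re.compile(
    r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS), re.IGNORECASE
)


def render_abstract(text):
    """Escape the whole abstract, then restore whitelisted tags only."""
    escaped = html.escape(text, quote=False)
    return _ESCAPED_ALLOWED.sub(r"<\1\2>", escaped)


sample = "with an average of <b>=40+/-13 km s<SUP>-1</SUP>"
rendered = render_abstract(sample)
print(rendered)
```

With this approach the author's "<b>" is shown literally instead of switching the text to bold, while the superscript is still rendered. The obvious weakness is that an abstract using a variable named "sub" or "sup" in angle brackets would still be misread.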