inveniosoftware / invenio

Invenio digital library framework
https://invenio.readthedocs.io
MIT License

INSPIRE: Outstanding issues with latex formats #847

Closed jrbl closed 9 years ago

jrbl commented 10 years ago

Originally on 2011-11-20

I fixed a lot of things in #841 (including #829) but not everything. The history there is becoming too intricate to follow and it has drifted from its original purpose. So I want to push that branch into production, mark that ticket as fixed, and then fix these remaining outstanding issues:

# Records can get confused about %%CITATION, e.g.
{{{
%\cite{hep-th/9312104}
\bibitem{hep-th/9312104}
  E.~Witten,
  %``The Verlinde algebra and the cohomology of the Grassmannian,''
  In *Cambridge 1993, Geometry, topology, and physics* 357-422
  [hep-th/9312104].
  %%CITATION = .....,,;%%
}}}
Even though there's an eprint number, the system does not give a usable %%CITATION. It's not clear what those dots are about.

# While most things have %%CITATION, not everything does, e.g.:
{{{
%\cite{arXiv:1004.0616}
\bibitem{arXiv:1004.0616}
  R.~Longo and E.~Witten,
  %``An Algebraic Construction of Boundary Quantum Field Theory,''
  Commun.\  Math.\  Phys.\ \ {\bf 303} (2011) 213
  [arXiv:1004.0616 [math-ph]].
}}}

This appears to be happening (on DEV) only for records which have both a pubnote and an arXiv number, but for which the coden lookup fails. More investigation is needed.

invenio-developers commented 10 years ago

Originally by hoc on 2011-11-21

When there is an eprint number, the eprint should always be the %%CITATION. This means it won't change just because the paper got published (which makes it easier to keep track of things).

invenio-developers commented 10 years ago

Originally by hoc on 2011-11-22

%\cite{839659}
\bibitem{839659}
  E.~Witten,
  %``Easing into QFT,''
  Conf.\ Proc.\ C\ {\bf 0208124} (2002) 14.
  %%CITATION = CONFP,C0208124,14;%%

This is the exception to the volume rule. It should be: Conf.\ Proc.\ {\bf C0208124} (2002) 14. The "volume" here is taken from the conference number C02-08-12.4; the C is not there because the journal Conf.Proc. has series A, B and C.

jrbl commented 10 years ago

Originally on 2011-11-23

Replying to [comment:2 hoc]:

> %\cite{839659}
> \bibitem{839659}
>   E.~Witten,
>   %``Easing into QFT,''
>   Conf.\ Proc.\ C\ {\bf 0208124} (2002) 14.
>   %%CITATION = CONFP,C0208124,14;%%
>
> This is the exception to the volume rule. It should be: Conf.\ Proc.\ {\bf C0208124} (2002) 14. The "volume" here is taken from the conference number C02-08-12.4; the C is not there because the journal Conf.Proc. has series A, B and C.

Is conf. proc. the only exception?

invenio-developers commented 10 years ago

Originally by hoc on 2011-11-23

Yes. It happens to Conf.Proc. because it's not a real journal. Real journals have A,B,C etc because they have different series (e.g. Phys.Rev.D is particle physics, Phys.Rev.C is nuclear physics).
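For illustration, the rule could be sketched roughly like this in Python. This is only a sketch: `split_series` is a made-up name, and it assumes the stored volume carries the leading letter (e.g. "D50" or "C0208124"), which may not match how the records are actually stored.

```python
def split_series(journal, volume):
    """Return (title, volume) with any series letter placed per the rule above.

    Hypothetical helper: assumes the volume string may start with one
    series letter followed by digits, e.g. "D50".
    """
    has_letter = (
        len(volume) > 1 and volume[0].isalpha() and volume[1:].isdigit()
    )
    if not has_letter:
        return journal, volume
    if journal == "Conf.Proc.":
        # The "C" comes from the conference number (C02-08-12.4),
        # so it stays glued to the volume.
        return journal, volume
    # Real journals: the letter names a series and moves to the title.
    return journal + volume[0], volume[1:]


print(split_series("Phys.Rev.", "D50"))
print(split_series("Conf.Proc.", "C0208124"))
```

With this, Phys.Rev. D50 becomes title "Phys.Rev.D" with volume "50", while Conf.Proc. keeps "C0208124" as its volume.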

jrbl commented 10 years ago

Originally on 2011-11-23

Replying to [comment:1 hoc]:

> When there is an eprint number, the eprint should always be the %%CITATION. This means it won't change just because the paper got published (which makes it easier to keep track of things).

So by this do you mean that if we have both a valid coden lookup, and a valid arxiv number, %%CITATION should prefer the arxiv number?

invenio-developers commented 10 years ago

Originally by hoc on 2011-11-23

Replying to [comment:5 jblayloc]:

> Replying to [comment:1 hoc]:
>
> > When there is an eprint number, the eprint should always be the %%CITATION. This means it won't change just because the paper got published (which makes it easier to keep track of things).
>
> So by this do you mean that if we have both a valid coden lookup, and a valid arxiv number, %%CITATION should prefer the arxiv number?

Yes.
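To make the agreed preference concrete, here is a minimal Python sketch. The function names are hypothetical, and upper-casing the eprint is an assumption based on the all-caps style of the CONFP example in this ticket, not something confirmed here.

```python
def citation_key(eprint=None, coden=None, volume=None, page=None):
    """Pick the %%CITATION key: the eprint wins whenever there is one.

    Hypothetical helper; the upper-casing of the eprint is an assumption.
    """
    if eprint:
        return eprint.upper()
    if coden and volume and page:
        # Fall back to the journal form, e.g. CONFP,C0208124,14
        return "{},{},{}".format(coden, volume, page)
    return None  # nothing usable: emit no %%CITATION line at all


def citation_line(key):
    """Wrap a key in the %%CITATION comment syntax used in the examples."""
    return "%%CITATION = " + key + ";%%"


print(citation_line(citation_key(eprint="hep-th/9312104", coden=".....")))
print(citation_line(citation_key(coden="CONFP", volume="C0208124", page="14")))
```

Because the eprint is checked first, a published paper keeps the same key it had as a preprint, and a failed or odd coden lookup no longer matters when an eprint exists.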

jrbl commented 10 years ago

Originally on 2011-11-23

Ok. To be explicit: we need %%CITATION to be ingestible by spires, and you're sure this won't break that?

Looking into:

{{{
%\cite{hep-th/9312104}
\bibitem{hep-th/9312104}
  E.~Witten,
  %``The Verlinde algebra and the cohomology of the Grassmannian,''
  In *Cambridge 1993, Geometry, topology, and physics* 357-422
  [hep-th/9312104].
  %%CITATION = .....,,;%%
}}}

...it seems that the problem is that '.....' is actually the coden for this record. If we're preferring arXiv numbers, that's probably not a problem for this specific record, but it leaves a low-priority question open: should we go through the coden KB and excise entries like '.....', '+++++', '*****' and '-----'?

invenio-developers commented 10 years ago

Originally by hoc on 2011-11-23

Anything can go into the CITATION field in INSPIRE, so these rare, weird ones shouldn't hurt it. I really don't understand why this record gets associated with '.....' though. It must be a difference between SPIRES and INSPIRE. In the CODEN KB, the entry with Coden '.....' has the short title Astrophys.J.Lett., which doesn't seem to have anything to do with this Witten record.
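If the KB cleanup is ever wanted, spotting the placeholder codens could look roughly like this. It is only a sketch: the `kb` dict is an illustrative stand-in for the real coden KB (the Conf.Proc. entry is inferred from the CONFP citation above), and `is_placeholder_coden` is a made-up name.

```python
def is_placeholder_coden(coden):
    """True for codens made of one repeated non-alphanumeric character,
    such as '.....', '+++++', '*****' or '-----'."""
    return len(coden) > 0 and len(set(coden)) == 1 and not coden[0].isalnum()


# Illustrative stand-in for the coden KB: short title -> coden.
kb = {
    "Astrophys.J.Lett.": ".....",
    "Conf.Proc.": "CONFP",
}

# Entries that someone would need to review by hand.
suspect = {title: c for title, c in kb.items() if is_placeholder_coden(c)}
print(suspect)
```

This only lists candidates; whether to excise them is still the open question above.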

jrbl commented 10 years ago

Originally on 2011-11-28

For \cite and \bibitem, we want to stick with 035z where available. I intend to fall back to the arXiv id in cases where 035z is missing; we should (not on this ticket) have a task to find every record without an 035z and generate one.
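As a sketch of that fallback, assuming a record is simplified to a plain dict: the keys "035z" and "arxiv_id", the function names, and the sample values are all hypothetical, not the real record API.

```python
def cite_key(record):
    """Prefer the 035z value; fall back to the arXiv id when it is missing."""
    key = record.get("035z")
    if key:
        return key
    return record.get("arxiv_id")  # may be None if neither is present


def cite_header(key):
    """Build the two header lines of an entry for the given key."""
    return "%\\cite{" + key + "}\n\\bibitem{" + key + "}"


print(cite_header(cite_key({"035z": "Witten:2002zz"})))
print(cite_header(cite_key({"arxiv_id": "arXiv:1004.0616"})))
```

Records where `cite_key` returns None are the ones the separate "generate an 035z" task would have to cover.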

invenio-developers commented 10 years ago

Originally by hoc on 2011-12-01

An extra "\" is appearing after journal titles, e.g.

Commun.\ Math.\ Phys.\ \ {\bf 303} (2011) 213
JHEP\ {\bf 1009} (2010) 092

which should be

Commun.\ Math.\ Phys.\ {\bf 303}, 213 (2011)
JHEP {\bf 1009}, 092 (2010)

While this doesn't affect the final output, it shouldn't be there, and users will notice (at least one has already written about it).
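One way to get the spacing right is to emit the "\ " only after an abbreviation period and a plain space otherwise. A rough Python sketch follows; `format_pubnote` is a made-up name, and it assumes journal titles are stored without spaces, as in "Commun.Math.Phys.".

```python
def format_pubnote(title, volume, page, year):
    """Format a journal reference with exactly one separator after the title.

    Hypothetical helper: assumes titles are stored like 'Commun.Math.Phys.'.
    """
    words = [w for w in title.split(".") if w]
    spaced = ".\\ ".join(words)
    if title.endswith("."):
        # Title ends in an abbreviation period: use a control space "\ "
        spaced += ".\\ "
    else:
        # No trailing period (e.g. JHEP): a plain space is enough
        spaced += " "
    return spaced + "{\\bf " + volume + "}, " + page + " (" + year + ")"


print(format_pubnote("Commun.Math.Phys.", "303", "213", "2011"))
print(format_pubnote("JHEP", "1009", "092", "2010"))
```

The separator is chosen in one place, so a second "\ " or a stray "\" after titles like JHEP cannot be appended later.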

invenio-developers commented 10 years ago

Originally by hoc on 2012-02-13

Everything now seems fixed except

%\cite{Witten:2002zz}
\bibitem{Witten:2002zz}
  E.~Witten,
  %``Easing into QFT,''
  Conf.\ Proc.\ C {\bf 0208124}, 14 (2002).
  %%CITATION = CONFP,C0208124,14;%%

Should be: Conf.\ Proc.\ {\bf C0208124}, 14 (2002).

jalavik commented 9 years ago

The LaTeX output has recently been redone. I have moved this issue to the INSPIRE channels so it can be checked there.