inukshuk / bibtex-ruby

A BibTeX library, parser, and converter for Ruby.
http://inukshuk.github.com/bibtex-ruby
GNU General Public License v3.0

Latex filter issues #28

Closed etc closed 13 years ago

etc commented 13 years ago

Using convert(:latex) on a bibliography currently has mixed success. In particular, the following LaTeX commands are not always correctly filtered:

I am puzzled by this, as for some of these commands, latex-decode by itself appears to give the correct results. For example:

ruby-1.9.2-p290 :007 > LaTeX.decode "\\\"{e}"
 => "ë"

The following transcript gives an example that exhibits all of these problems (it involves one call to BibTeX.open and then one call to BibTeX.convert; a search for "\" will show where the problems lie):

ruby-1.9.2-p290 :001 > require 'bibtex'
 => true 
ruby-1.9.2-p290 :002 > b = BibTeX.open('./latex.bib')
 => @book{proust_1996,
  address = {Paris},
  author = {Proust, Jo\"{e}lle},
  booktitle = {Perception et Intermodalit\'{e}: Approches Actuelles De La Question De Molyneux},
  date-added = {2011-10-02 12:51:45 -0400},
  date-modified = {2011-10-02 12:51:45 -0400},
  editor = {Proust, Jo\"{e}lle},
  keywords = {Perception; Molyneux's Problem},
  publisher = {Presses Universitaires de France},
  title = {Perception et Intermodalit\'{e}: Approches Actuelles De La Question De Molyneux},
  year = {1996}
}
@incollection{bach-y-rita_1996,
  abstract = {Paul Bach-y-Rita (chapter 20) discusses his work on tactile-vision substitution systems. This research suggests that it is possible to ``see'' by means of tactile sensations, if these sensations are appropriately embedded within a sensorimotor framework.},
  author = {{Bach-y-Rita}, Paul},
  crossref = {proust_1996},
  date-added = {2011-10-02 12:51:26 -0400},
  date-modified = {2011-10-02 12:51:26 -0400},
  keywords = {Perception; Molyneux's Problem; Vision},
  note = {Reprinted in translation in \textcite[pp. 497--514]{noe_2002}.},
  pages = {81--100},
  title = {Substitution Sensorielle et Qualia}
}
@article{noe_2008,
  author = {No\"{e}, Alva},
  date-added = {2011-10-02 12:44:40 -0400},
  date-modified = {2011-10-02 12:44:40 -0400},
  journal = {Philosophy and Phenomenological Research},
  keywords = {Perception; Enactivism; Vision},
  month = {may},
  number = {3},
  pages = {660--665},
  title = {Pr\'{e}cis of \emph{Action in Perception}},
  url = {http://dx.doi.org/10.1111/j.1933-1592.2008.00161.x},
  volume = {76},
  year = {2008}
}
@article{bermudez_2007,
  author = {Berm\'{u}dez, Jos\'{e} Luis},
  date-added = {2011-10-02 12:43:54 -0400},
  date-modified = {2011-10-02 12:43:54 -0400},
  journal = {Philosophical Perspectives},
  keywords = {Nonconceptual Content; Mind; Perception},
  month = {dec},
  number = {1},
  pages = {55--72},
  title = {What is at Stake in the Debate on Nonconceptual Content?},
  url = {http://dx.doi.org/10.1111/j.1520-8583.2007.00120.x},
  volume = {21},
  year = {2007}
}
@book{ellegard_1958,
  address = {G\"{o}teborg},
  author = {Elleg{\aa}rd, Alvar},
  booktitle = {Darwin and the General Reader: The Reception of Darwin's Theory of Evolution in the British Periodical Press, 1859---1972},
  date-added = {2011-10-02 12:41:36 -0400},
  date-modified = {2011-10-02 12:42:40 -0400},
  keywords = {Darwin; History of Biology; History of Science; Sociology of Science},
  note = {Reprinted by University of Chicago Press.},
  publisher = {G\"{o}teborg Universitets {\AA}rsskrift},
  title = {Darwin and the General Reader: The Reception of Darwin's Theory of Evolution in the British Periodical Press, 1859--1972},
  volume = {64},
  year = {1958}
}
@article{haggqvist_2007,
  abstract = {It is widely held that the meaning of certain types of terms, such as natural kind terms, is individuated externalistically, in terms of the individual's external environment. Recently a more radical thesis has emerged, a thesis we dub `a posteriori semantics.' The suggestion is that not only does a term's meaning depend on the external environment, but so does its semantics. One motivation for this is the aim to account for cases where a putative natural kind term fails to pick out a natural kind: The term may have a standard externalist semantics (if it picks out a natural kind) or a more descriptivist one (if it does not). Knowing which semantics applies will therefore require detailed empirical knowledge. This move has also been employed in cases where a singular term, such as a name, fails to have a reference. We argue that a posteriori semantics is inherently implausible, since the type of semantics common terms should be given ought not to be conditional on details of chemistry or physics. A number of difficulties for the position---`metaphysical,' epistemological, and methodological---are articulated. Finally, we suggest that a posteriori semantics misconstrues the way in which semantics is empirical.},
  author = {H\"{a}ggqvist, S\"{o}ren and \"{A}sa Maria Wikforss},
  date-added = {2011-10-02 12:41:11 -0400},
  date-modified = {2011-10-02 12:41:11 -0400},
  journal = {Erkenntnis},
  keywords = {Externalism; Content; Mind},
  month = {nov},
  number = {3},
  pages = {373--386},
  title = {Externalism and A Posteriori Semantics},
  url = {http://dx.doi.org/10.1007/s10670-007-9051-4},
  volume = {67},
  year = {2007}
}
@article{hajek_1996,
  abstract = {According to finite frequentism, the probability of an attribute A in a finite reference class B is the relative frequency of actual occurrences of A within B. I present fifteen arguments against this position.},
  author = {H\'{a}jek, Alan},
  date-added = {2011-10-02 12:40:45 -0400},
  date-modified = {2011-10-02 12:40:45 -0400},
  journal = {Erkenntnis},
  keywords = {Probability},
  month = {nov},
  number = {2-3},
  pages = {209-227},
  title = {``Mises redux''---Redux: Fifteen Arguments against Finite Frequentism},
  volume = {45},
  year = {1996}
}
@article{bergstrom_1970a,
  author = {Bergstr\"{o}m, Ingvar},
  date-added = {2011-10-02 12:39:58 -0400},
  date-modified = {2011-10-02 12:39:58 -0400},
  journal = {Oud Holland},
  keywords = {Holland; 17C; History of Art},
  number = {1-4},
  pages = {143-157},
  title = {De Gheyn as a \emph{Vanitas} Painter},
  url = {http://dx.doi.org/10.1163/187501770X00112},
  volume = {85},
  year = {1970}
}
@incollection{bricmont_2001,
  address = {Heidelberg},
  author = {Bricmont, Jean and D\"{u}rr, Detlef and Galavotti, Maria C. and Ghirardi, Giancarlo and Petruccione, Francesco and Zangh\`{i}, Nino},
  booktitle = {Chance in Physics: Foundations and Perspectives},
  date-added = {2011-10-02 12:39:05 -0400},
  date-modified = {2011-10-02 12:39:05 -0400},
  editor = {Bricmont, Jean and D\"{u}rr, Detlef and Galavotti, Maria C. and Ghirardi, Giancarlo and Petruccione, Francesco and Zangh\`{i}, Nino},
  keywords = {Philosophy of Science; Physics; Probability; Quantum Mechanics; Thermodynamics},
  publisher = {Springer},
  series = {Lecture Notes in Physics},
  title = {Chance in Physics: Foundations and Perspectives},
  year = {2001}
}
@article{bowler_1975,
  author = {Bowler, Peter J.},
  date-added = {2011-10-02 12:38:00 -0400},
  date-modified = {2011-10-02 12:38:00 -0400},
  journal = {Journal of the History of Ideas},
  keywords = {History of Biology; History of Science},
  month = {mar},
  number = {1},
  pages = {95--114},
  title = {The Changing Meaning of ``Evolution''\,},
  url = {http://dx.doi.org/10.2307/2709013},
  volume = {36},
  year = {1975}
}
@article{wood_1995,
  author = {Wood, Christopher S.},
  date-added = {2011-10-02 12:36:54 -0400},
  date-modified = {2011-10-02 12:36:54 -0400},
  issue = {October-December},
  journal = {Word and Image},
  keywords = {History of Art; Holland; 17C; Curiosity},
  number = {4},
  pages = {332-352},
  title = {\,`Curious Pictures' and the Art of Description},
  volume = {11},
  year = {1995}
}
@article{worrall_2000a,
  abstract = {Having been neglected or maligned for most of this century, Newton's method of 'deduction from the phenomena' has recently attracted renewed attention and support. John Norton, for example, has argued that this method has been applied with notable success in a variety of cases in the history of physics and that this explains why the massive underdetermination of theory by evidence, seemingly entailed by hypothetico-deductive methods, is invisible to working physicists. This paper, through a detailed analysis of Newton's deduction of one particular 'proposition' in optics 'from the phenomena', gives a clearer account than hitherto of the method - highlighting the fact that it is really one of deduction from the phenomena plus 'background knowledge'. It argues, that, although the method has certain heuristic virtues, examination of its putative accreditational strengths reveals a range of important problems that its defenders have yet adequately to address.
  },
  author = {Worrall, John},
  date-added = {2011-10-02 12:36:13 -0400},
  date-modified = {2011-10-02 12:36:13 -0400},
  journal = {British Journal for the Philosophy of Science},
  keywords = {Newton; Underdetermination; Confirmation; Induction; Scientific Method; Philosophy of Science},
  month = {mar},
  number = {1},
  pages = {45--80},
  title = {The Scope, Limits, and Distinctiveness of the Method of `Deduction from the Phenomena': Some Lessons from Newton's `Demonstrations' in Optics},
  url = {http://dx.doi.org/10.1093/bjps/51.1.45},
  volume = {51},
  year = {2000}
}

ruby-1.9.2-p290 :004 > b.convert(:latex)
 => @book{proust_1996,
  address = {Paris},
  author = {Proust, Jo\"{e}lle},
  booktitle = {Perception et Intermodalité: Approches Actuelles De La Question De Molyneux},
  date-added = {2011-10-02 12:51:45 -0400},
  date-modified = {2011-10-02 12:51:45 -0400},
  editor = {Proust, Jo\"{e}lle},
  keywords = {Perception; Molyneux's Problem},
  publisher = {Presses Universitaires de France},
  title = {Perception et Intermodalité: Approches Actuelles De La Question De Molyneux},
  year = {1996}
}
@incollection{bach-y-rita_1996,
  abstract = {Paul Bach-y-Rita (chapter 20) discusses his work on tactile-vision substitution systems. This research suggests that it is possible to ``see'' by means of tactile sensations, if these sensations are appropriately embedded within a sensorimotor framework.},
  author = {{Bach-y-Rita}, Paul},
  crossref = {proust_1996},
  date-added = {2011-10-02 12:51:26 -0400},
  date-modified = {2011-10-02 12:51:26 -0400},
  keywords = {Perception; Molyneux's Problem; Vision},
  note = {Reprinted in translation in \textcite[pp. 497–514]noe_2002.},
  pages = {81–100},
  title = {Substitution Sensorielle et Qualia}
}
@article{noe_2008,
  author = {No\"{e}, Alva},
  date-added = {2011-10-02 12:44:40 -0400},
  date-modified = {2011-10-02 12:44:40 -0400},
  journal = {Philosophy and Phenomenological Research},
  keywords = {Perception; Enactivism; Vision},
  month = {may},
  number = {3},
  pages = {660–665},
  title = {Précis of \emphAction in Perception},
  url = {http://dx.doi.org/10.1111/j.1933-1592.2008.00161.x},
  volume = {76},
  year = {2008}
}
@article{bermudez_2007,
  author = {Berm\'{u}dez, Jos\'{e} Luis},
  date-added = {2011-10-02 12:43:54 -0400},
  date-modified = {2011-10-02 12:43:54 -0400},
  journal = {Philosophical Perspectives},
  keywords = {Nonconceptual Content; Mind; Perception},
  month = {dec},
  number = {1},
  pages = {55–72},
  title = {What is at Stake in the Debate on Nonconceptual Content?},
  url = {http://dx.doi.org/10.1111/j.1520-8583.2007.00120.x},
  volume = {21},
  year = {2007}
}
@book{ellegard_1958,
  address = {Göteborg},
  author = {Elleg{\aa}rd, Alvar},
  booktitle = {Darwin and the General Reader: The Reception of Darwin's Theory of Evolution in the British Periodical Press, 1859—1972},
  date-added = {2011-10-02 12:41:36 -0400},
  date-modified = {2011-10-02 12:42:40 -0400},
  keywords = {Darwin; History of Biology; History of Science; Sociology of Science},
  note = {Reprinted by University of Chicago Press.},
  publisher = {Göteborg Universitets \AArsskrift},
  title = {Darwin and the General Reader: The Reception of Darwin's Theory of Evolution in the British Periodical Press, 1859–1972},
  volume = {64},
  year = {1958}
}
@article{haggqvist_2007,
  abstract = {It is widely held that the meaning of certain types of terms, such as natural kind terms, is individuated externalistically, in terms of the individual's external environment. Recently a more radical thesis has emerged, a thesis we dub `a posteriori semantics.' The suggestion is that not only does a term's meaning depend on the external environment, but so does its semantics. One motivation for this is the aim to account for cases where a putative natural kind term fails to pick out a natural kind: The term may have a standard externalist semantics (if it picks out a natural kind) or a more descriptivist one (if it does not). Knowing which semantics applies will therefore require detailed empirical knowledge. This move has also been employed in cases where a singular term, such as a name, fails to have a reference. We argue that a posteriori semantics is inherently implausible, since the type of semantics common terms should be given ought not to be conditional on details of chemistry or physics. A number of difficulties for the position—`metaphysical,' epistemological, and methodological—are articulated. Finally, we suggest that a posteriori semantics misconstrues the way in which semantics is empirical.},
  author = {H\"{a}ggqvist, S\"{o}ren and \"{A}sa Maria Wikforss},
  date-added = {2011-10-02 12:41:11 -0400},
  date-modified = {2011-10-02 12:41:11 -0400},
  journal = {Erkenntnis},
  keywords = {Externalism; Content; Mind},
  month = {nov},
  number = {3},
  pages = {373–386},
  title = {Externalism and A Posteriori Semantics},
  url = {http://dx.doi.org/10.1007/s10670-007-9051-4},
  volume = {67},
  year = {2007}
}
@article{hajek_1996,
  abstract = {According to finite frequentism, the probability of an attribute A in a finite reference class B is the relative frequency of actual occurrences of A within B. I present fifteen arguments against this position.},
  author = {H\'{a}jek, Alan},
  date-added = {2011-10-02 12:40:45 -0400},
  date-modified = {2011-10-02 12:40:45 -0400},
  journal = {Erkenntnis},
  keywords = {Probability},
  month = {nov},
  number = {2-3},
  pages = {209-227},
  title = {``Mises redux''—Redux: Fifteen Arguments against Finite Frequentism},
  volume = {45},
  year = {1996}
}
@article{bergstrom_1970a,
  author = {Bergstr\"{o}m, Ingvar},
  date-added = {2011-10-02 12:39:58 -0400},
  date-modified = {2011-10-02 12:39:58 -0400},
  journal = {Oud Holland},
  keywords = {Holland; 17C; History of Art},
  number = {1-4},
  pages = {143-157},
  title = {De Gheyn as a \emphVanitas Painter},
  url = {http://dx.doi.org/10.1163/187501770X00112},
  volume = {85},
  year = {1970}
}
@incollection{bricmont_2001,
  address = {Heidelberg},
  author = {Bricmont, Jean and D\"{u}rr, Detlef and Galavotti, Maria C. and Ghirardi, Giancarlo and Petruccione, Francesco and Zangh\`{i}, Nino},
  booktitle = {Chance in Physics: Foundations and Perspectives},
  date-added = {2011-10-02 12:39:05 -0400},
  date-modified = {2011-10-02 12:39:05 -0400},
  editor = {Bricmont, Jean and D\"{u}rr, Detlef and Galavotti, Maria C. and Ghirardi, Giancarlo and Petruccione, Francesco and Zangh\`{i}, Nino},
  keywords = {Philosophy of Science; Physics; Probability; Quantum Mechanics; Thermodynamics},
  publisher = {Springer},
  series = {Lecture Notes in Physics},
  title = {Chance in Physics: Foundations and Perspectives},
  year = {2001}
}
@article{bowler_1975,
  author = {Bowler, Peter J.},
  date-added = {2011-10-02 12:38:00 -0400},
  date-modified = {2011-10-02 12:38:00 -0400},
  journal = {Journal of the History of Ideas},
  keywords = {History of Biology; History of Science},
  month = {mar},
  number = {1},
  pages = {95–114},
  title = {The Changing Meaning of ``Evolution''\,},
  url = {http://dx.doi.org/10.2307/2709013},
  volume = {36},
  year = {1975}
}
@article{wood_1995,
  author = {Wood, Christopher S.},
  date-added = {2011-10-02 12:36:54 -0400},
  date-modified = {2011-10-02 12:36:54 -0400},
  issue = {October-December},
  journal = {Word and Image},
  keywords = {History of Art; Holland; 17C; Curiosity},
  number = {4},
  pages = {332-352},
  title = {\,`Curious Pictures' and the Art of Description},
  volume = {11},
  year = {1995}
}
@article{worrall_2000a,
  abstract = {Having been neglected or maligned for most of this century, Newton's method of 'deduction from the phenomena' has recently attracted renewed attention and support. John Norton, for example, has argued that this method has been applied with notable success in a variety of cases in the history of physics and that this explains why the massive underdetermination of theory by evidence, seemingly entailed by hypothetico-deductive methods, is invisible to working physicists. This paper, through a detailed analysis of Newton's deduction of one particular 'proposition' in optics 'from the phenomena', gives a clearer account than hitherto of the method - highlighting the fact that it is really one of deduction from the phenomena plus 'background knowledge'. It argues, that, although the method has certain heuristic virtues, examination of its putative accreditational strengths reveals a range of important problems that its defenders have yet adequately to address.
  },
  author = {Worrall, John},
  date-added = {2011-10-02 12:36:13 -0400},
  date-modified = {2011-10-02 12:36:13 -0400},
  journal = {British Journal for the Philosophy of Science},
  keywords = {Newton; Underdetermination; Confirmation; Induction; Scientific Method; Philosophy of Science},
  month = {mar},
  number = {1},
  pages = {45–80},
  title = {The Scope, Limits, and Distinctiveness of the Method of `Deduction from the Phenomena': Some Lessons from Newton's `Demonstrations' in Optics},
  url = {http://dx.doi.org/10.1093/bjps/51.1.45},
  volume = {51},
  year = {2000}
}

ruby-1.9.2-p290 :005 > 

etc commented 13 years ago

It would also be nice to properly convert `` and '' to “ and ”, and likewise for single quotes. Apologies if any of this is outside the scope of what latex-decode is intended to do; I didn't know if you wanted it to be as general purpose as, for example, parsing \emph{} markup. (In case you didn't know about \,, it is the code for thin space, Unicode U+2009.)
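
The conversions requested here can be sketched with plain String#gsub. This is only an illustration under stated assumptions, not how latex-decode implements them; the method name decode_punctuation is hypothetical.

```ruby
# Hypothetical helper, not part of latex-decode: maps a few LaTeX
# punctuation ligatures to their Unicode equivalents.
def decode_punctuation(text)
  text
    .gsub('---', "\u2014")  # three hyphens: em dash (must run before the two-hyphen rule)
    .gsub('--', "\u2013")   # two hyphens: en dash
    .gsub('``', "\u201C")   # opening double quote
    .gsub("''", "\u201D")   # closing double quote
    .gsub('\\,', "\u2009")  # \, is a thin space
end

puts decode_punctuation("The Changing Meaning of ``Evolution''\\,")
puts decode_punctuation('pages 95--114')
```

The order of the two dash rules matters: if the two-hyphen rule ran first, a triple hyphen would come out as an en dash followed by a stray hyphen.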

inukshuk commented 13 years ago

I suspect that this is caused by improper handling of backslashes somewhere in the parsing process. I'll try to fix it ASAP; meanwhile, it would be great if you could convert the examples to proper cucumber features – that way we can make sure there will be no regressions for this issue.

Originally, the scope of latex-decode was to include all possible conversions of LaTeX directives that can be represented by Unicode; for that reason, \emph is problematic, because we'd have to decide which markup to convert to.

By the way, fascinating stuff, your bibliography ;-)

etc commented 13 years ago

I am happy to write some associated cucumber features—but will probably not get to it for a few days. (All the samples I'm using are taken from https://github.com/etc/philosophy-bibliography)

inukshuk commented 13 years ago

Brad!

Thanks for the test cases and my apologies for not having been able to look at this sooner.

At first I suspected this to be an issue with string escape sequences in the parser (those can be quite annoying to track down). As it turns out, the issue was that names needed a little extra treatment: regular values consist of tokens, but names consist of name tokens, and each name token in turn consists of the individual name parts.
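
To illustrate why names need the extra step, here is a minimal sketch with made-up stand-in classes. Value, Name and Names are hypothetical and are not bibtex-ruby's real classes; the point is only that a filter has to descend one level further for names than for regular values.

```ruby
# Hypothetical stand-ins for parsed field values; NOT bibtex-ruby's classes.
Value = Struct.new(:tokens) do
  # A regular value is a flat list of string tokens.
  def convert(filter)
    Value.new(tokens.map { |token| filter.call(token) })
  end
end

Name = Struct.new(:first, :last) do
  # A single name is made of parts; each part is filtered on its own.
  def convert(filter)
    Name.new(filter.call(first), filter.call(last))
  end
end

Names = Struct.new(:names) do
  # A names value holds Name objects, so it must recurse into each one.
  def convert(filter)
    Names.new(names.map { |name| name.convert(filter) })
  end
end

# Toy filter that knows exactly one diaeresis.
decode = ->(s) { s.gsub('\\"{e}', 'ë') }

title  = Value.new(['No\\"{e} on perception'])
author = Names.new([Name.new('Alva', 'No\\"{e}')])

puts title.convert(decode).tokens.first
puts author.convert(decode).names.first.last
```

If the filter were applied only to the top-level tokens, the Name objects inside Names would pass through unchanged, which matches the symptom in the transcript: titles were decoded while author and editor fields were not.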

Additionally, you were using a number of conversions latex-decode wasn't aware of. I've added most of them, but I feel a little bit uneasy about the quotation mark replacements. Would you expect every single ' to be converted (as your test cases seem to suggest)? I have implemented it that way for now, but I wonder if this is indeed the right approach.
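
The concern can be made concrete with a naive rule. This is a sketch for illustration only; naive_single_quotes is a hypothetical name and is not what latex-decode does internally.

```ruby
# Naive rule, for illustration: every ` opens a quote, every ' closes one.
def naive_single_quotes(text)
  text.gsub('`', "\u2018").gsub("'", "\u2019")
end

# LaTeX-style quoting converts as intended.
puts naive_single_quotes("`Curious Pictures' and the Art of Description")

# An apostrophe becomes U+2019, which is the conventional typographic choice.
puts naive_single_quotes("Molyneux's Problem")

# Straight quotes used on both sides both turn into closing quotes.
puts naive_single_quotes("deduction from the phenomena plus 'background knowledge'")
```

The last case is the problematic one: without context there is no way to tell that the first ' was meant as an opening quote.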

I have yet to add support for the thin space. Meanwhile, could you check whether the solution works for you? Your features should all pass, except for one: is it Äsa Maria Wikforss or Åsa Maria Wikforss? I suspect the latter, but I thought I'd better ask you, as you may want to fix that in the bibliography.

To test you will need to issue

$ [sudo] bundle install

in order to fetch the latest latex-decode.

inukshuk commented 13 years ago

Brad,

I've added \, support to latex-decode, but I'll hold off on a new gem release in case there are any other symbols, diacritics, etc. that you need but that are not supported yet.

About the \emph issue: we could include a LaTeX to HTML (or something else) converter to BibTeX Ruby proper; I imagine this may be quite useful if you are processing the entries directly (plus, citeproc and citeproc-js generally try to handle HTML tags gracefully, too). What do you say?

inukshuk commented 13 years ago

Just pushed 2.0.1; when using latex-decode 0.0.7 the \, conversion should be supported, too. I'll close the issue for now, please reopen if I've missed something.

On a different note: is there any bibtex or cite-formatting related support you need with maldini? I've been meaning to take a look at that, as I'll need to integrate academic citations with a website later this year.

alexeymuranov commented 13 years ago

\c c was not mentioned, it does not work for me.

inukshuk commented 13 years ago

Alex, the c-cedilla should be supported by latex-decode. This works for me:

$ gem i latex-decode
Successfully installed latex-decode-0.0.7
1 gem installed
Installing ri documentation for latex-decode-0.0.7...
Installing RDoc documentation for latex-decode-0.0.7...
$ irb -r "latex/decode"
001:0> LaTeX.decode '\c{c}'
=> "ç"
002:0> LaTeX.decode '\c{C}'
=> "Ç"

I was using Ruby 1.9.2 but it is supposed to work on other versions as well. Please report back if this example does not work.

alexeymuranov commented 13 years ago

Thanks! I'll use latex-decode then. I just thought that convert(:latex) should have taken care of it.

alexeymuranov commented 13 years ago

It does not accept the syntax without curly brackets, though:

pry(main)> LaTeX.decode '\c{C}'
=> "Ç"
pry(main)> LaTeX.decode '\c C'
=> "\\c C"

I have those in my bibliography.

inukshuk commented 13 years ago

The Latex filter uses the latex-decode gem, so whatever works there, should work in bibtex-ruby, too.

If the syntax without curly braces is supported by latex, we can add it to latex-decode (it is very easy to add conversions there).

alexeymuranov commented 13 years ago

I will look into it and try to give a minimal example. It seems that in my bibliography I have Author = {... Fran{\c c}ois ...}, but after convert(:latex) it becomes Fran\c cois. Both are acceptable in LaTeX, but I think that in BibTeX bibliography fields it is common, if not necessary, to surround such commands with curly brackets.

alexeymuranov commented 13 years ago

Here is how LaTeX.decode behaves with '{\c c}': instead of replacing it with 'ç', it simply strips the curly brackets:

pry(main)> puts LaTeX.decode('Fran{\c c}ois')
Fran\c cois

inukshuk commented 13 years ago

Can you point to any official LaTeX documentation that describes this syntax you are using? If I run this through xelatex I get a 'control sequence undefined' error.

If this is not valid syntax, we can't add it to latex-decode. However, you can easily write your own filter: just pass an object that responds to :apply to #convert or, alternatively, write a class that inherits from BibTeX::Filter and implements #apply – then you can call your filter by name:

 class MyFilter < BibTeX::Filter
   def apply(input)
     input.gsub(/\{\\c c\}/, 'ç')
   end
 end

Now you can use your filter on a Bibliography object with bib.convert_myfilter or bib.convert(:myfilter). Alternatively, as I said above, you can pass in any object that responds to :apply; take a look at this test case for an example.

Anyway, please let me know if the {\c c} syntax is indeed valid LaTeX, and I'll add it to latex-decode ASAP.
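
For the duck-typed variant, a minimal self-contained sketch might look like the following. The class name CedillaFilter is hypothetical, and the example calls #apply directly on a string so that it runs without bibtex-ruby installed.

```ruby
# A filter object in the duck-typed style: anything that responds to #apply.
# CedillaFilter is a hypothetical name used only for this sketch.
class CedillaFilter
  # Matches {\c c} and {\c C}, allowing optional whitespace after \c.
  PATTERN = /\{\\c\s*([cC])\}/

  def apply(input)
    input.gsub(PATTERN) { $1 == 'c' ? 'ç' : 'Ç' }
  end
end

filter = CedillaFilter.new
puts filter.apply('Fran{\c c}ois')

# With bibtex-ruby loaded, the description above suggests passing this
# object to #convert, e.g. bib.convert(filter); that call is not run here.
```

Compared with the gsub in the class above, the pattern here also covers the upper-case letter, but it is still a workaround for one specific spelling rather than a general solution.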

alexeymuranov commented 13 years ago

Thanks for the explanation and suggested workarounds. I will try to see if I can find some "official" documentation. However, this syntax is valid to the best of my knowledge: it is basic TeX syntax, which is supported in LaTeX. There are commands that accept a single token as argument, and this argument is just the first token or the first group inside {} that follows the command, possibly after a sequence of spaces. This is what I vaguely remember from Knuth's TeXbook. The following example works for me in TeX Live LaTeX:

\documentclass{article}
\begin{document}
\c c
\c {abc}
\' c
\end{document}

The bibliography I am dealing with was downloaded from the HAL archive. It contains {\c c}. As far as I understand, curly brackets in BibTeX fields have an additional meaning besides the usual one of delimiting a group (like comments in config files that actually contain metadata to be processed): they tell BibTeX not to post-process what is inside (in particular, not to change the case).

I'll look for something official.

inukshuk commented 13 years ago

I just checked one more time using latex and you're absolutely right, everything works. We'll add support for this to latex-decode.

alexeymuranov commented 13 years ago

Thank you. I think the best way would be to implement the standard TeX parsing rules. LaTeX documentation usually suggests using brackets, but they are not needed for parsing.
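
As a rough sketch of that rule for a single command, \c, assuming a regex-based approach rather than a real TeX tokenizer (decode_cedilla and CEDILLA_ARG are hypothetical names, not latex-decode's):

```ruby
# After \c, the argument is either a braced group (optionally preceded by
# spaces) or, after at least one space, the next single letter. Requiring
# the space in the unbraced case keeps \cite and similar commands intact.
CEDILLA_ARG = /\\c(?:\s*\{([^{}]*)\}|\s+([A-Za-z]))/

def decode_cedilla(text)
  text.gsub(CEDILLA_ARG) do |match|
    arg = $1 || $2
    # Only a one-character argument has an obvious Unicode counterpart;
    # anything else is left exactly as written.
    next match unless arg.length == 1
    # Base letter plus U+0327 COMBINING CEDILLA, composed where possible.
    (arg + "\u0327").unicode_normalize(:nfc)
  end
end

puts decode_cedilla('Fran\c cois')   # François
puts decode_cedilla('\c{C}a')        # Ça
puts decode_cedilla('\c {abc}')      # unchanged
puts decode_cedilla('\cite{key}')    # unchanged
```

Braces that enclose the whole command, as in {\c c}, are left in place by this sketch and would need a separate pass. A full solution would tokenize the input the way TeX does instead of relying on one pattern per command.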

inukshuk commented 13 years ago

Alexey, I pushed latex-decode 0.0.8 which hopefully solves the problem:

mbp:latex-decode$ irb -r 'latex/decode'
001:0> LaTeX.decode '\c C'
=> "Ç"
002:0> LaTeX.decode '\c {cbc}'
=> "çbc"
003:0> LaTeX.decode '\c cab'
=> "çab"
004:0> LaTeX.decode '{\c c}'
=> "ç"

As I don't currently require the LaTeX filter myself, I haven't done any extensive testing; fingers crossed that this doesn't break any other conversions. If you have any other issues which pertain strictly to latex decoding, please post them over at the project repository.

Thanks for reporting this issue!

alexeymuranov commented 13 years ago

Thank you. However, I suggest that latex-decode should ignore '\c {cbc}' altogether, as the output you show does not resemble my pdfLaTeX's output :).

inukshuk commented 13 years ago

You're right, that doesn't make sense – we can't actually reproduce LaTeX's behaviour in that instance using unicode, I don't think. ;-)