Closed havardo closed 3 years ago
It happens because f. in Cymbalaria muralis G.Gaertn, B.Mey. & Schreb. f. toutonii (A.Chev.) Cuf. might mean forma or filius, and there is no way to know which one it is.
https://github.com/gnames/gnparser#names-with-filius-icn-code
I am closing this issue, as I do not have a good algorithm for distinguishing forma and filius in such cases. Fortunately, such occasions are pretty rare, and gnparser issues a warning when it happens. So I guess the solution for now is to filter results by that warning and adjust them manually.
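The filter-then-review workflow mentioned above could look roughly like the sketch below. It assumes the parser output has already been loaded into a list of dicts. The `name` and `warnings` keys, the marker substring, and the sample warning text are hypothetical stand-ins for illustration, not gnparser's documented output schema.

```python
from typing import Iterable

# Hypothetical marker: a substring assumed to appear in the ambiguity warning.
AMBIGUITY_MARKER = "ambiguous"


def split_by_ambiguity(records: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Split parsed records into (clean, needs_review) by warning text."""
    clean, needs_review = [], []
    for record in records:
        # "warnings" is an assumed key holding a list of warning strings.
        warnings = record.get("warnings", [])
        if any(AMBIGUITY_MARKER in w.lower() for w in warnings):
            needs_review.append(record)
        else:
            clean.append(record)
    return clean, needs_review


if __name__ == "__main__":
    sample = [
        {
            "name": "Cymbalaria muralis G.Gaertn, B.Mey. & Schreb. "
                    "f. toutonii (A.Chev.) Cuf.",
            "warnings": ["Ambiguous f. (filius or forma)"],  # assumed text
        },
        {"name": "Picea glauca var. albertiana f. conica", "warnings": []},
    ]
    clean, needs_review = split_by_ambiguity(sample)
    for record in needs_review:
        print("review manually:", record["name"])
```

Only the flagged records then need a manual decision; everything else passes through unchanged.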
Thanks for a swift response @dimus, and a fair point. The code of nomenclature is certainly flawed in this case. There might be a statistical argument for favouring forma over filius in cases like this, but as you say, they are rare.
Hi @dimus, I was just wondering if the principle below might work to identify forma or filius?
On the basis that all infraspecific names are lower case and not punctuated, we can say:
If an f. is preceded by lowercase text, and that text is neither a recognized rank nor preceded by a punctuation mark (accidental lowercase authorship), we can assume that the f. represents the forma rank and not filius.
As you say, these cases are rare.
@havardo, as I understand it we already have this; can you give an example?
For example, in the tests in the 'filius' section there is Amelanchier arborea f. hirsuta (Michx. f.) Fernald, where both kinds of f. are present. I think the problem with forma and filius arises in cases where even a human cannot say which it is without prior knowledge.
Thanks for taking the time to look into this a bit further, @dimus.
The examples below are all forma and adhere to the principles described above.
The text following f. is lowercase and not punctuated (hence neither an author nor a rank).
The parser reports them all as ambiguous:
Sanguinaria canadensis L. f. multiplex (E.H.Wilson) Weath.
Rosa banksiae R.Br. f. lutescens Voss
Prunus cerasifera Ehrh. f. stipitata Bregadze
Cupressus obtusa (Siebold & Zucc.) F.Muell. f. formosana (Hayata) Clinton-Baker
For example, consider Sanguinaria canadensis L. f. multiplex (E.H.Wilson) Weath.
Without additional information, how can we conclude that f. means Sanguinaria canadensis forma multiplex and not L. filius?
The text multiplex is written in lowercase and has no punctuation. We can therefore conclude that the text is neither authorship nor a rank. Hence, the text must be an infraspecific name. We can then conclude that the preceding f. is forma and not filius.
This approach will not work if an authorship is accidentally written in lowercase, but then the name is noncompliant anyway.
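The reasoning above can be sketched as a small token-based check. This is only an illustration of the proposed rule, not how gnparser is implemented: the function names, the whitespace tokenisation, and the short list of rank markers are all assumptions made for the example.

```python
import re

# Small illustrative set of rank markers (assumed); a real parser knows more.
RANK_MARKERS = {"f.", "fa.", "forma", "var.", "subsp.", "ssp.", "subvar.", "subf."}

# An epithet here is lowercase letters only (hyphens allowed), no punctuation.
EPITHET_RE = re.compile(r"^[a-z]+(?:-[a-z]+)*$")


def looks_like_epithet(token: str) -> bool:
    """True if the token is lowercase, unpunctuated, and not a rank marker."""
    return token not in RANK_MARKERS and bool(EPITHET_RE.match(token))


def classify_f(name: str) -> list[tuple[int, str]]:
    """Return (token index, 'forma' or 'filius') for every 'f.' in the name."""
    tokens = name.split()
    results = []
    for i, token in enumerate(tokens):
        # Strip brackets so that "f.)" in "(Michx. f.)" is still recognised.
        if token.strip("()[]") != "f.":
            continue
        following = tokens[i + 1] if i + 1 < len(tokens) else ""
        kind = "forma" if looks_like_epithet(following) else "filius"
        results.append((i, kind))
    return results


if __name__ == "__main__":
    for example in (
        "Sanguinaria canadensis L. f. multiplex (E.H.Wilson) Weath.",
        "Rosa banksiae R.Br. f. lutescens Voss",
        "Amelanchier arborea f. hirsuta (Michx. f.) Fernald",
    ):
        print(example, "->", classify_f(example))
```

The sketch inspects only the token that follows each f., so a name in which a filius author is immediately followed by an epithet would still be classified as forma.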
We cannot be sure; for example, Ficus aspera Forster f. nota Blanco has Forster f. as an author. However, I think you are right that when such a pattern occurs, the probability that f. means forma is much higher than the probability that it means filius.
So it is better to parse f. as forma, with the same warning as before.
I created a new issue at https://github.com/gnames/gnparser/issues/154
Parsing the name
Cymbalaria muralis G.Gaertn, B.Mey. & Schreb. f. toutonii (A.Chev.) Cuf.
appears not to successfully identify "f." as a rank. However, if "f." is replaced with "var.", the parsing is successful:
Cymbalaria muralis G.Gaertn, B.Mey. & Schreb. var. toutonii (A.Chev.) Cuf.
Examples using the rank form, but without authors, seem to work fine:
Picea glauca var. albertiana f. conica
BTW: Great project !!!