identify at the level of <rank>

Ghini / ghini.desktop

plant collections manager (desktop version)

http://ghini.github.io/

GNU General Public License v2.0


identify at the level of <rank> #92

Open mfrasca opened 8 years ago

mfrasca commented 8 years ago

todo

let's start small …

[x] create ghini-3.2-dev branch
[x] create Taxonomy plugin
[x] define Taxon and Rank database classes
[x] create unit test suite
[ ] represent hybrids, as nothotaxon
[ ] represent hybrids, as formula
[ ] convert initialization data for Taxon and Rank tables from ghini.pocket taxon table
[ ] remove Plants plugin

see how far we get, decide following steps

original text

From @mfrasca on March 30, 2014 13:49

from an email conversation with @Ejgouda (working at the botanical garden in Utrecht and providing us scientific support). we posed a question and he answered this way:

Q: for some of the problematic cases like 'quesito', 'cardón', 'rodilla de viejo', Saskia sort of knew the genus, but we still wonder what to write about the species.
A: We identify to genus level in this case, and if a species is not certain an identification qualifier (like cf.) can be added in the verification record.

this is not the way bauble works.

currently bauble forces us to identify at the level of species, always. when we can identify the plant at the level of species, this goes fine, but in other cases we have to insert fictional data in the database.

more generic:

plant identified at (rank) level of genus:
- we insert a fictional 'sp' species in the genus.
plant identified at level of family:
- we insert a fictional 'problematic' genus in the family, then a fictional 'sp' species in the new fictional genus.

more specific:

plant identified at level of variety, form, subspecies:
- we create a new species, repeating the information relative to the common parent species, but indicating that the species isn't really a species but a taxon at a more specific rank.

the last case has already been observed also in bauble.webapp and it has been addressed by renaming the 'species' table to 'taxon' (see Bauble/bauble.webapp#4), but this does not solve the need to repeat information when more specific information is available nor the need to insert fictional objects when the information available isn't specific enough.

Copied from original issue: mfrasca/bauble.classic#9

mfrasca commented 8 years ago

this is getting higher priority after first contact with Quito Botanical Garden. their need is different, but the source of the problem within bauble is the same and we can solve both issues... at the QBG they have a large collection of orchids. for Orchidaceae (but also for Leguminosae) it is common to have several steps between familia and genus, namely subfamilia, tribu, subtribu. these are completely specified once you have the genus, so one could argue they are useless once you have the genus, yet they help the reader group the different genera together.

mfrasca commented 8 years ago

@RoDuth

(I get a little concerned that your desire to "flatten" the data structure with "taxon" for family/genus/species is a fairly significant break from any of the botanic record systems I have seen over the years, but, as long as it is easy enough to get the data out in an ITF2/ABCD/etc. compatible format, e.g. for PlantSearch updates, I guess it's no big deal?)

funny way to say "little concerned", while I guess you're just plain scared. am I planning to flatten the data structure? I would like to acknowledge that family/genus/species (but also tribe and variety) are all taxa, each with a rank and each with a parent taxon. so from my point of view then possibly yes I want to flatten things into one table. from your point of view I plan to offer you more levels (ranks) so actually the opposite of flattening.
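To make the "one table, each taxon has a rank and a parent" idea concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table and column names, the helper functions, and the genus *Tilia* are assumptions chosen for illustration, not the actual Ghini schema; the depth values are borrowed from the Rank table proposed later in this thread.

```python
import sqlite3

# Hypothetical schema: every taxon, whatever its rank, is one row that points to its parent.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rank (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL UNIQUE,
        depth INTEGER NOT NULL
    );
    CREATE TABLE taxon (
        id        INTEGER PRIMARY KEY,
        rank_id   INTEGER NOT NULL REFERENCES rank(id),
        epithet   TEXT NOT NULL,
        parent_id INTEGER REFERENCES taxon(id)
    );
""")
conn.executemany("INSERT INTO rank (name, depth) VALUES (?, ?)",
                 [("familia", 10), ("subfamilia", 14), ("genus", 20)])


def add_taxon(rank_name, epithet, parent_id=None):
    """Insert one taxon row and return its id."""
    cur = conn.execute(
        "INSERT INTO taxon (rank_id, epithet, parent_id) "
        "VALUES ((SELECT id FROM rank WHERE name = ?), ?, ?)",
        (rank_name, epithet, parent_id))
    return cur.lastrowid


def lineage(taxon_id):
    """Walk the parent links upwards and return (rank, epithet) pairs, highest rank first."""
    return conn.execute("""
        WITH RECURSIVE up(rank_id, epithet, parent_id) AS (
            SELECT rank_id, epithet, parent_id FROM taxon WHERE id = ?
            UNION ALL
            SELECT t.rank_id, t.epithet, t.parent_id
            FROM taxon AS t JOIN up ON t.id = up.parent_id
        )
        SELECT rank.name, up.epithet
        FROM up JOIN rank ON rank.id = up.rank_id
        ORDER BY rank.depth
    """, (taxon_id,)).fetchall()


tiliaceae = add_taxon("familia", "Tiliaceae")
tilioideae = add_taxon("subfamilia", "Tilioideae", tiliaceae)
tilia = add_taxon("genus", "Tilia", tilioideae)
print(lineage(tilia))
```

With this layout a plant identified only to subfamily can point straight at the `Tilioideae` row, so no fictional genus or species is needed, and intermediate ranks are simply extra rows between family and genus.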

mfrasca commented 8 years ago

mentioned in Bauble/bauble.classic#211

mfrasca commented 8 years ago

From @RoDuth on December 9, 2015 22:15

Accepted, and I do see the reasoning, it's just such a major deviation from anything else I've seen. From my perspective it is a case of what is the data for and how do you use it... For me it is mainly just what is where, when was it planted and where did it come from. The taxonomic/nomenclature details are not something I need at hand necessarily, but that's maybe because I don't have particularly large collections of any specific taxonomic groups. Also, nomenclature is such a.... messy art that I just don't have the time (or need) to get bogged down in it. While we have collections such as Zingiberales, Bromeliaceae, Orchidaceae, Gymnospermae, Ferns and allies, we can always get what we need from the binomial and family. I'm no fan of "double handling" data that can be found elsewhere, especially when they can be in such a state of flux, so the binomial is the key for me; if I ever need the rest I can go looking for it.

For the issues you have raised regarding plants not yet identified I believe there are conventions in place, at least with the botanists I deal with here there are. The use of "botanist tag" names, arbitrary names at whichever level of identification. Something like RUB sp. (IGC1033) an unnamed plant likely from the family Rubiaceae or Actephila sp. Koumala (I.G Champion 870) (which has now been formally identified as Actephila championiae see here). I even have a family in the database called Unknown which is used when this is not even known and a GEN genus also. When it's an unknown variety etc. it is written as Dianella caerulea (Finch Hatton) or something like this. This is the way we currently communicate, Bauble does allow it and we are not likely to change even if there were other ways of doing it so for me I fear that such a radical change can only bring me trouble! But as I say as long as I can continue to do things the way I do now... no big deal. Right?

mfrasca commented 8 years ago

> I even have a family in the database called Unknown which is used when this is not even known and a GEN genus also

at JBQ, for plants identified at level of family but unknown genus, I suggested entering a genus named like the family, but prefixed with a "Zzz-", missing the trailing 'e', and obviously belonging to the family. in this genus we have the likewise unknown species "sp". eg: »Zzz-orchidacea sp« I guess I would name your Unknown family "Zzz-plantaceae" :smile: , then the genus I would probably call Zzz-planta and the species (the one you can finally associate to an accession) »Zzz-planta sp«

the reason for that leading zzz is lexicographic: 'unknown', or 'problematic' (my first guess at the Cuchubo garden 2 years ago) would go somewhere in the middle, messing up the group of better identified plants.
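A small sketch of the sorting argument, in plain Python. The function name `placeholder_genus` and the sample family list are made up for illustration; only the `Zzz-` convention itself comes from the comment above.

```python
def placeholder_genus(family: str) -> str:
    """Build the placeholder genus name: 'Zzz-' + lower-cased family name without its trailing 'e'."""
    stem = family.lower()
    if stem.endswith("e"):
        stem = stem[:-1]
    return "Zzz-" + stem


# 'Unknown' lands in the middle of the well-identified families ...
with_unknown = sorted(["Rubiaceae", "Unknown", "Araceae", "Zingiberaceae"])
# ... while a 'Zzz-' placeholder sorts after all of them.
with_zzz = sorted(["Rubiaceae", "Zzz-plantaceae", "Araceae", "Zingiberaceae"])

print(placeholder_genus("Orchidaceae"))
print(with_unknown)
print(with_zzz)
```

Note that this relies on plain string ordering; a database collation that ignores case or punctuation may order things slightly differently.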

at JBQ they told me they would be happy being able to group things by subfamilies, tribes and subtribes (they focus on Orchidaceae), because some taxonomists specialize in such sections of that huge family. so it would be nice for them, they said, to be able to show a list of plants sorted by subtribe, for example.

the solution I have offered is based on the exporting facility: the template will add subtribe information to the genus, then tribe information to subtribes and subfamily information to tribes. it has to be hard coded in the template. the template is public and you can have a look at it.

mfrasca commented 8 years ago

From @RoDuth on December 10, 2015 16:8

The examples of "botanist tags" I provided above are common practice with the taxonomists in Australia, they are all "real world" examples and may be found in published reference material so we stick to this format. When a formal identification turns up in one of the reliable sources (Qld Herb Recs, APNI, etc.) we switch to that and the "botanist tag names" become synonyms because you still may come across them in old literature. I have never thought of the fact that they may end up in the middle lexicographically as an issue because we see them as valid binomials (until a better comes along) anyway. I can see the point though.
Does the JBQ collection not have similarly "botanist tag" named plants? Who does their identifications? Do they have a taxonomist either as staff or with whom they work? Do they voucher? Are we talking wild collected, yet to be identified plants or plants that have lost their names to time and poor record keeping? Species or potentially hybrids/cultivars etc.? My thought on it is that the binomial is the key to all that other data, it's the equivalent of the accession number, and if you have the right binomial then so much of the other data is fairly easily accessible by clicking the relevant "link" in the info pane... but then... I don't have the Orchid problem... thankfully!! :laughing: As I said, we do have some of these sorts of collections for which it is common to use the intermediate groupings (Broms, Orchids, Cycads, etc.), I just haven't had a need for them and hence any need to keep them in the database. Family I do use a lot but nothing much more than that. I'm assuming that JBQ DO use these intermediate groupings. Do you know what for?

mfrasca commented 8 years ago

@felipead87, @TatiJaramilloV, read the above comment, it's a lot of questions for either of you, or for Lucho, but he doesn't yet have an account here.

mfrasca commented 8 years ago

@RoDuth , where do you put this "botanist tag"? can you paste here some information from your database? or the screenshot of a view on your data?

mfrasca commented 8 years ago

From @RoDuth on December 10, 2015 23:2

@felipead87, @TatiJaramilloV I just re-read my questions above, sorry if they sound a little presumptuous, it definitely wasn't my intent. I am just interested in the differences in end users' needs and uses. It's more a matter of interest than anything. When we surveyed gardens in Australia and New Zealand about their database needs and wants the differences could be quite large, and I wonder if there may be similarities with some of those who responded to our survey?

@mfrasca I don't have our complete database at hand right now, but here are 2 not-so-great examples of how we use "botanist tags" for names.

[screenshot from 2015-12-11 07 59 43]

The first is just the placeholder for complete unknowns that have not been seen by a botanist yet. The 2 accessions are 2 different ferns from a collection trip that we (or any of our local experts) could not ID to any level confidently. We will grow these on until they are large enough to get material to back voucher with the Qld Herbarium. Once identified they will be renamed in Bauble, but GEN sp. will NOT be considered a synonym (although it will be recorded in a change note of course).

The second is a similar situation: a plant that we collected and have sent a voucher of to the Queensland Herbarium, and this is the name it came back as. It has obviously proven to be an unrecorded species and will at some point in the future get a formal ID, but for now it is referred to as RUB sp. nov (IGC1033) in correspondence, the Qld Herbarium Records, etc. When it does receive a name, RUB sp. nov (IGC1033) WILL be recorded as a synonym.

Here is how it looks in the species editor. Not ideal, as we would normally record the Author (the Queensland Herbarium botanist that assessed our voucher and returned the ID), but as I said I only have a small subset of our data on hand right now (am at home).

[screenshot from 2015-12-11 08 01 39]

Hope that makes sense.

mfrasca commented 8 years ago

From @RoDuth on December 12, 2015 13:5

mfrasca commented 6 years ago

something like … Rank

id	name	short	depth	shows_as	defines
1	"regnum"	""	0	".epithet sp."	"regnum"
2	"divisio"	""	2	".epithet sp."	"divisio"
3	"classis"	""	4	".epithet sp."	"classis"
4	"ordo"	""	8	"[.epithet] sp."	"ordo"
5	"familia"	"Fam."	10	"[.epithet] sp."	"familia"
6	"subfamilia"	"Subfam."	14	"[.epithet] sp."	"subfamilia"
7	"tribus"	"Tr."	16	".epithet sp."	""
8	"subtribus"	"Subtr."	18	".epithet sp."	""
9	"genus"	"Gen."	20	"[.epithet] sp."	"genus"
10	"subgenus"	"Subgen."	25	".genus subg. .epithet sp."	""
11	"sectio"	"Sec."	30	".genus sec. .epithet sp."	""
12	"subsectio"	"Subsec."	35	".genus subsec. .epithet sp."	""
13	"species"	"sp."	40	"[.ranked_name .epithet]"	"binomial"
14	"subspecies"	"subsp."	45	".binomial subsp. .epithet"	""
15	"varietas"	"var."	50	".binomial var. .epithet"	""
16	"forma"	"f."	55	".binomial f. .epithet"	""
17	"cultivar"	"cv"	99	".complete '.epithet'"	""

Taxon

rank	epithet	author	year	parent	accepted

ask a Taxon to show itself, it consults its corresponding Rank, in particular the shows_as field. the shows_as is split by spaces, all parts that start with a . are considered a field, all other parts are considered verbatim. if a field is defined at that rank, fine, otherwise look in the parent. the field defines helps stop the upwards lookup. the part in [] is the replacement for the locally defined field.
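The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the plugin's code: the class layout and the method names `_render`, `_lookup` and `_defined_value` are hypothetical, and because `.ranked_name` is not explained in this thread, the species template here uses `.genus` in its place.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rank:
    name: str
    shows_as: str      # template, e.g. "[.epithet] sp."
    defines: str = ""  # field name this rank provides to lower ranks


@dataclass
class Taxon:
    rank: Rank
    epithet: str
    parent: Optional["Taxon"] = None

    def _render(self, template: str) -> str:
        # Split on spaces: parts starting with '.' are fields, the rest is verbatim.
        parts = []
        for part in template.split():
            parts.append(self._lookup(part[1:]) if part.startswith(".") else part)
        return " ".join(parts)

    def _lookup(self, field: str) -> str:
        if field == "epithet":
            return self.epithet
        # Climb the parent links until some rank defines the requested field.
        taxon = self
        while taxon is not None:
            if taxon.rank.defines == field:
                return taxon._defined_value()
            taxon = taxon.parent
        raise LookupError(f"no ancestor defines {field!r}")

    def _defined_value(self) -> str:
        # The bracketed part of shows_as is the value this rank supplies.
        template = self.rank.shows_as
        if "[" in template and "]" in template:
            inner = template[template.index("[") + 1:template.index("]")]
            return self._render(inner)
        return self.epithet

    def show(self) -> str:
        return self._render(self.rank.shows_as.replace("[", "").replace("]", ""))


genus = Rank("genus", "[.epithet] sp.", defines="genus")
# Assumption: ".genus" stands in for the ".ranked_name" used in the table above.
species = Rank("species", "[.genus .epithet]", defines="binomial")
varietas = Rank("varietas", ".binomial var. .epithet")

araucaria = Taxon(genus, "Araucaria")
cunninghamii = Taxon(species, "cunninghamii", parent=araucaria)
papuana = Taxon(varietas, "papuana", parent=cunninghamii)

print(araucaria.show())
print(cunninghamii.show())
print(papuana.show())
```

The point of the design is that each rank only knows its own template; everything else is found by walking up the tree, so adding a rank such as subtribus does not require touching the ranks below it.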

mfrasca commented 6 years ago

there's no control on typesetting, but it does compute representations as described in #79

mfrasca commented 6 years ago

so: when you identify a plant at any rank other than species or an infraspecific rank, the new plugin gives you the identification in this form:

    tax = self.session.query(Taxon).filter_by(epithet='Annona').first()
    self.assertEqual(tax.show(), 'Annona sp.')
    tax = self.session.query(Taxon).filter_by(epithet='Tilioideae').first()
    self.assertEqual(tax.show(), 'Tilioideae sp.')
    tax = self.session.query(Taxon).filter_by(epithet='Tiliaceae').first()
    self.assertEqual(tax.show(), 'Tiliaceae sp.')

when you insert a new species, which has not yet been completely correctly placed in the taxonomic derivation, and even start producing cultivars from it, you may get things like:

    sp_nov = Taxon(rank=self.sp_nov, parent=cucurbita, epithet='IGC1033')
    self.assertEqual(sp_nov.show(), 'Cucurbita sp. nov. (IGC1033)')
    sp_nov = Taxon(rank=self.sp_nov, parent=cucurbitaceae, epithet='IGC1034')
    self.assertEqual(sp_nov.show(), 'Cucurbitaceae sp. nov. (IGC1034)')
    cv = Taxon(rank=self.cultivar, parent=sp_nov, epithet='Lekker Bek')
    self.assertEqual(cv.show(), "Cucurbitaceae sp. nov. (IGC1034) 'Lekker Bek'")
    sp_nov = Taxon(rank=self.sp_nov, parent=self.plantae, epithet='IGC1035')
    self.assertEqual(sp_nov.show(), 'Plantae sp. nov. (IGC1035)')

mfrasca commented 6 years ago

following a hint by @RoDuth, this checks the Australian style for a new unpublished species in a new unpublished genus, pretending that we were also developing cultivars for it.

    asterales = Taxon(rank=self.ordo, parent=self.plantae, epithet='Asterales')
    asteraceae = Taxon(rank=self.familia, parent=asterales, epithet='Asteraceae')
    gen_nov = Taxon(rank=self.genus, parent=asteraceae, nov_code='Aq520454')
    sp_nov = Taxon(rank=self.species, parent=gen_nov, nov_code='D.A.Halford Q811', nov_name='Shute Harbour')
    cv = Taxon(rank=self.cultivar, parent=sp_nov, epithet='Due di Denari')
    self.assertEqual(sp_nov.show(), 'Gen. (Aq520454) sp. Shute Harbour (D.A.Halford Q811)')
    self.assertEqual(gen_nov.show(), 'Gen. (Aq520454) sp.')
    self.assertEqual(cv.show(), "Gen. (Aq520454) sp. Shute Harbour (D.A.Halford Q811) 'Due di Denari'")

mfrasca commented 6 years ago

@RoDuth , a taxon is a taxon, and its representation is its representation. I'm not confusing the object with its name, I know that a species has to be printed as its binomial plus authorship for species epithet. and I know that infraspecific taxa want the binomial plus the deepest infraspecific rank in its derivation. the current implementation (1.0), if you ask me, is a set of mistakes on top of each other, each fixing one aspect without addressing the mistake. I'm trying to fix the original sin, that's it.

the "automatic" change you mention is interesting!

if an infraspecific taxon T1 at rank R is created of a species T0 then T0 automatically gets a ranked epithet at rank R that mirrors the species epithet. E.g. when they decided that the Araucaria cunninghamii from PNG was a variety and named it Araucaria cunninghamii var. papuana then all other Araucaria cunninghamii automatically became Araucaria cunninghamii var. cunninghamii.

what about authorship?

does this also mean that you never explicitly create an infraspecific taxon T2 of species T0 where the epithet mirrors the species epithet?
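The "automatic" rule discussed in this comment (botanists call the mirrored name an autonym) can be sketched as follows. The function names and the `(rank, epithet)` pair representation are assumptions for illustration; authorship is deliberately left out, since that question is still open here.

```python
RANK_ABBREV = {"subspecies": "subsp.", "varietas": "var.", "forma": "f."}


def autonym(binomial: str, rank: str) -> str:
    """Name that mirrors the species epithet at the given infraspecific rank."""
    species_epithet = binomial.split()[1]
    return f"{binomial} {RANK_ABBREV[rank]} {species_epithet}"


def names_in_use(binomial: str, infraspecific: list) -> list:
    """List the names implied by a species and its (rank, epithet) infraspecific taxa.

    For every rank at which an infraspecific taxon exists, the mirrored name
    is implied as well; with no infraspecific taxa only the binomial remains.
    """
    ranks = list(dict.fromkeys(rank for rank, _ in infraspecific))  # unique, in order
    names = [autonym(binomial, rank) for rank in ranks]
    names += [f"{binomial} {RANK_ABBREV[rank]} {epithet}" for rank, epithet in infraspecific]
    return names or [binomial]


print(names_in_use("Araucaria cunninghamii", []))
print(names_in_use("Araucaria cunninghamii", [("varietas", "papuana")]))
```

Computing the mirrored name on demand like this is one possible answer to the T2 question: nothing has to be inserted explicitly, but then there is also no record to attach identifications to.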

Ejgouda commented 6 years ago

The best is that you can have a few settings for different use (label, screen etc.) :

name with or without author for the actual epithet (including basionym author if a recombination)
name include one or more levels of infraspecific ranks, for example (3) /Aechmea //distichantha //var. distichantha forma distichantha/ can also named as (max 2) /Aechmea distichantha forma distichantha/, because the infraspecific epithet should be unique.
setting what to do with identifications on (sub)genus level and above, for example add 'spec.' to the genus name

The best implementation is to have a hierarchical tree of equal taxon records, each with only one epithet and rank and a name algorithm to calculate the name in each level, including subgenera etc. in the tree. To make it more complicated, beside a taxon parent link, you could have taxon mother and father links for natural hybrids, with or without an epithet.
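A rough sketch of two of the settings listed above (author on or off, and how many infraspecific levels to print). The function `format_name` and its parameters are hypothetical, and `infra_levels` counts only the infraspecific ranks, which may not match the numbering used in the example above.

```python
def format_name(genus, species, infra, infra_levels=1, author=None):
    """Build a display name.

    infra        -- list of (rank_abbrev, epithet) pairs, shallowest rank first
    infra_levels -- how many of the deepest infraspecific ranks to print
    author       -- optional authorship string, appended at the end
    """
    kept = infra[-infra_levels:] if infra_levels > 0 else []
    parts = [genus, species] + [f"{rank} {epithet}" for rank, epithet in kept]
    name = " ".join(parts)
    return f"{name} {author}" if author else name


infra = [("var.", "distichantha"), ("forma", "distichantha")]
print(format_name("Aechmea", "distichantha", infra, infra_levels=2))
print(format_name("Aechmea", "distichantha", infra, infra_levels=1))
print(format_name("Aechmea", "distichantha", infra, infra_levels=0))
```

Dropping intermediate levels is only safe because, as noted above, the infraspecific epithet should be unique within the species.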

Just how we solved this, and highly flexible.

Cheers, Eric


mfrasca commented 6 years ago

Hi @Ejgouda, nice to hear from you!

> a hierarchical tree of equal taxon records, each with only one epithet and rank and a name algorithm to calculate the name in each level, including subgenera etc. in the tree.

this is indeed what I did in the new Taxonomy plugin.
have you got some concrete examples, both easy and complicated, against which I could try the logic I've designed? the idea is to build logic which can be completed with database data. (https://github.com/Ghini/ghini.desktop/issues/92#issuecomment-427621096)

this "automatic infraspecific" I still need to design. it would mean that when you have a taxon at rank species, and it has taxa referring to it from lower ranks, it should … well, I still don't know, I have to think about it. it sounds strange anyway. I mean, the fact that an Aechmea minor f. ubi-major exists (making your A. minor automatically need ›f. minor‹) does not automatically mean that you have it in your database. not only that: you may know that this forma ubi-major exists, but you do not have it in your database, so you do need, don't you?, to explicitly insert the A. minor f. minor in the database. as said, interesting, but with still unclear consequences.

Ejgouda commented 6 years ago

Hi Mario,

If you want, I can send you the algorithm I have designed for it.

Below you are talking about autonyms: names that become active after a new (the first) infraspecific taxon has been described, and that do not have an author. You can solve this kind of name in your database in different ways. You need to suppress them the moment that infraspecific taxon becomes a synonym and no other infraspecific taxa exist. You can make Aechmea minor forma minor a synonym of Aechmea minor to solve all internal problems, but to the outside you have to suppress it in any output.

So the moment Aechmea minor f. ubi-major (I don't think the hyphen is right here) becomes a synonym of another taxon, Aechmea minor f. minor no longer exists, but in the database there are probably many references pointing to that record, like identifications etc. I solve this problem by making Aechmea minor f. minor a synonym of Aechmea minor. For example, an accession record with an identification to the forma will automatically point to the right taxon, Aechmea minor, because the identification points to a synonym record. The record needs to stay in the database, so as not to lose any information, but to the outside world it does not exist.

Another way could be not to have a physical record for autonyms, but to use a flag saying that, for example, an identification points to an autonym of the taxon record. So the moment Aechmea minor forma ubi-major has been created, you show the autonym in the taxon selection list, but there is no physical record.

At the moment I have chosen the first option. Maybe the second option is better, but then you can no longer use extended description records for autonyms, which are often used in literature (something I don't like very much, but it will solve other problems).
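The first option (keep the autonym as a physical record and make it a synonym) can be sketched like this. The class `TaxonRecord`, its `accepted` link and the function `resolve` are hypothetical names for illustration; the actual tables are not shown in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaxonRecord:
    name: str
    accepted: Optional["TaxonRecord"] = None  # None: this record is itself the accepted taxon


def resolve(taxon: TaxonRecord) -> TaxonRecord:
    """Follow synonym links until an accepted record is reached."""
    seen = set()
    while taxon.accepted is not None:
        if id(taxon) in seen:  # guard against accidental synonym loops
            raise ValueError("synonym cycle")
        seen.add(id(taxon))
        taxon = taxon.accepted
    return taxon


species = TaxonRecord("Aechmea minor")
# The autonym record stays in the database, but only as a synonym of the species.
autonym_record = TaxonRecord("Aechmea minor f. minor", accepted=species)

# An identification that still points at the autonym resolves to the species.
print(resolve(autonym_record).name)
```

Old identifications keep pointing at the autonym record, so no information is lost, while any output goes through `resolve` and therefore never shows the suppressed name.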

Eric


mfrasca commented 6 years ago

> If you want, I can send you the algorithm I have designed for it.

that would be very interesting! you would share it under the same license as Ghini, that is GPL3+. your algorithm, you also implemented it, in TaxaSoft, didn't you? why don't you publish? I'm curious to compare your work with mine and the rest I've been seeing around. one would even build a pocket-server to make your taxasoft talk to ghini.pocket.

I am considering the sequence of actions, and wondering of the naming consequences:

insert species A: R. officinalis
insert forma F: R. officinalis f. minor
- automatically A shows as R. officinalis f. officinalis
insert subspecies S: R. officinalis subsp. foetida
- automatically A shows as ??? what?
- F keeps the same indication, but how do you know it's a form of A and not of S ?

the opposite sequence sounds less of a problem:

insert species A: R. officinalis
insert subspecies S: R. officinalis subsp. foetida
- automatically A shows as R. officinalis subsp. officinalis
insert forma F1 and F2, one of A, one of S.
- what limitations do you have in choosing the two forma epithets?
- how would the two F1 and F2 show? would they mention the subspecies rank epithet?

Ejgouda commented 6 years ago

Hi Mario,

I'm working at the moment on a conversion of our Atlantis BG database to MySQL, using a database dump of 16 GB of semi XML.

Following function is from my research database and could be different than the one I have on github

The function takes a Taxon record as argument, hope you know the syntax of SQL:

Eric

function calculate_full_name( $row)
{
     if( !$row['id'])
         return '';

     if( !$row['rank'])
         die( "calculate_full_name Error: Record has no rank");

     $id= $row['id'];    //* keep the original id, $row is overwritten while walking up the parents

//    $epithet= $row['hyphen'] ? substr( $row['epithet'], 0, $row['hyphen']).'-'.substr( $row['epithet'], $row['hyphen']) : $row['epithet'];
     $epithet=  $row['epithet'];

     if( $row['rank'] == 'unranked' || $row['rank'] == 'genus' || $row['rank'] == 'subfam.'  || $row['rank'] == 'tribus'  || $row['rank'] == 'subtrib.'   || $row['rank'] == 'family' )
         return $row['epithet'];
     else if( trim( $row['hyb']) != '' && trim($row['epithet']) == '' )    //* calculate hybrid formula of the correct names
     {
         $row1= mysqli_fetch_array( query( "SELECT taxon.*, correct.name as cname FROM taxon
             LEFT JOIN taxon as correct ON taxon.correct_taxon_id=correct.id WHERE taxon.id={$row['mother']};" )) or $row1['name']='?';
         $row2= mysqli_fetch_array( query( "SELECT taxon.*, correct.name as cname FROM taxon
             LEFT JOIN taxon as correct ON taxon.correct_taxon_id=correct.id  WHERE taxon.id={$row['father']};" )) or $row2['name']='?';
         $array= explode( ' ', ($row2['cname'] ? $row2['cname'] : $row2['name']));
         $dupl= '';

         //* skip the leading words of the father name that the mother name shares (e.g. the genus)
         while( $part= array_shift( $array))
         {
             $dupl= trim( "$dupl $part");

             if( substr( ($row1['cname'] ? $row1['cname'] : $row1['name']), 0, strlen( $dupl)) != $dupl)
                 break;
         }

         return ($row1['cname'] ? $row1['cname'] : $row1['name']).' '.$row['hyb'].' '.trim($part.' '.implode(' ', $array));
     }
     else         //* calculate normal names
     {
         $rank= $row['rank'];

         if( $rank == 'var.' || $rank == 'forma' || $rank == 'subsp.')
             $name= "$rank $epithet";
         else
             $name= $epithet;

         if( $row['hyb'] != '')    //* x comes at the end of name
             $name .= " {$row['hyb']}";

         while( $row && $row['rank'] != 'genus' && $row['parent_taxon_id'])
         {
             //* keep parent id and query text, both are used in the messages below
             $parent_id= $row['parent_taxon_id'];
             $query= "SELECT * FROM taxon WHERE id=$parent_id";
             $result= query( $query, "BF 372");

             if( $row= mysqli_fetch_array( $result))
             {
                 if( $row['rank'] == 'genus' || $row['rank'] == 'species')
                 {
                     if( $rank == 'section' || $rank == 'subgen.')
                         $name= "$name ($rank of {$row['epithet']})";
                     else if( $row['hyb'])
                         $name= $row['epithet']." {$row['hyb']} $name";
                     else
                         $name= $row['epithet']." $name";
                 }
             }
             else
                 die( "calculate_full_name Error: Parent with id($parent_id) not found ->$query");

             echoDebug( "Found: {$row['id']} {$row['rank']} {$row['epithet']} ->$query<br>");
         }

         if( $row['rank'] != 'genus' )
             die( "calculate_full_name Error: Name record with id($id) has no parent with genus rank");

         return $name;
     }
}
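
For readers more at home in Ghini's own language, the "normal names" branch of the function above can be sketched in Python roughly as follows. This is only an approximation under stated assumptions: the `taxon` table is replaced by a plain dictionary keyed by id, the function name `full_name` is hypothetical, the sample taxa are invented, and the hybrid-formula branch is left out.

```python
# Ranks whose name is just the epithet, and ranks that are shown with their abbreviation.
TOP_RANKS = {"unranked", "genus", "subfam.", "tribus", "subtrib.", "family"}
INFRA_RANKS = {"var.", "forma", "subsp."}

# Invented sample data standing in for the taxon table.
TAXA = {
    1: {"rank": "genus", "epithet": "Planta", "hyb": "", "parent_taxon_id": None},
    2: {"rank": "species", "epithet": "alba", "hyb": "", "parent_taxon_id": 1},
    3: {"rank": "subsp.", "epithet": "minor", "hyb": "", "parent_taxon_id": 2},
    4: {"rank": "forma", "epithet": "nana", "hyb": "", "parent_taxon_id": 3},
    5: {"rank": "subgen.", "epithet": "Magna", "hyb": "", "parent_taxon_id": 1},
}


def full_name(taxon_id, taxa):
    """Compose a display name by walking up the parents until a genus is reached."""
    row = taxa[taxon_id]
    rank, epithet = row["rank"], row["epithet"]
    if rank in TOP_RANKS:
        return epithet

    name = f"{rank} {epithet}" if rank in INFRA_RANKS else epithet
    if row["hyb"]:
        name = f"{name} {row['hyb']}"  # hybrid marker goes at the end, as in the PHP

    current = row
    while current["rank"] != "genus":
        parent_id = current["parent_taxon_id"]
        current = taxa.get(parent_id)
        if current is None:
            raise ValueError(f"taxon {taxon_id} has no parent with genus rank")
        # Only genus and species ancestors contribute to the name.
        if current["rank"] in ("genus", "species"):
            if rank in ("section", "subgen."):
                name = f"{name} ({rank} of {current['epithet']})"
            elif current["hyb"]:
                name = f"{current['epithet']} {current['hyb']} {name}"
            else:
                name = f"{current['epithet']} {name}"
    return name
```

With the invented data, `full_name(3, TAXA)` gives "Planta alba subsp. minor" and `full_name(5, TAXA)` gives "Magna (subgen. of Planta)". Because only genus and species ancestors are added, `full_name(4, TAXA)` gives "Planta alba forma nana": in this sketch a forma placed under a subspecies does not mention the subspecies epithet.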


mfrasca commented 6 years ago

Eric, I have some idea of SQL ;-) isn't it a bit like asking you if you know of plants … I don't particularly enjoy deciphering PHP/Perl, but I will survive, your logic is clear enough.

- in your query, you do select *, then only use a couple of fields. you might want to optimize that.
- I don't see what you do for nothotaxa
- what is the role of BF 372?
- I suppose that $row['hyb'] is either empty or a ×, or can it be H and +?
- you might want not to reuse a parameter name as a local variable ($row).
- is there anything like PEP8 in Perl/PHP?
- assigning a numerical value to ranks might help with tests like is_infraspecific($row['rank']), and would help make your code more flexible.
- I'm surprised to see iteration where I would expect recursion (going up the taxonomy). Perl can do object oriented, can't it?
- I don't follow how you do sections.
- when you identify an accession at rank above species, you want to add a trailing sp., don't you? where do you do that?
- when you identify an accession with some doubt ('forsan', '?', 'cfr', etc), where do you store that? and how do you put that in the computed string?
- you're executing the same query twice, but building it twice from data. I suppose you have some reason not to use the placeholder?
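
Two of the points above (selecting only the needed fields, and using a placeholder instead of building the query text twice) can be illustrated with a short Python sketch. It uses the standard library `sqlite3` module as a stand-in for MySQL, a cut-down `taxon` table with invented rows, and a hypothetical helper `correct_name`; none of this is the posted code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxon (id INTEGER PRIMARY KEY, name TEXT, correct_taxon_id INTEGER)"
)
# Invented rows: taxon 2 is a synonym whose correct name is taxon 1.
conn.executemany(
    "INSERT INTO taxon VALUES (?, ?, ?)",
    [(1, "Planta alba", None), (2, "Planta candida", 1), (3, "Planta nigra", None)],
)

# One query text, written once; the id is bound through the ? placeholder.
CORRECT_NAME_SQL = """
    SELECT taxon.name, correct.name AS cname
    FROM taxon
    LEFT JOIN taxon AS correct ON taxon.correct_taxon_id = correct.id
    WHERE taxon.id = ?
"""


def correct_name(connection, taxon_id):
    """Return the correct name of a taxon, its own name if it has none, or '?' if missing."""
    row = connection.execute(CORRECT_NAME_SQL, (taxon_id,)).fetchone()
    if row is None:
        return "?"
    name, cname = row
    return cname or name


mother = correct_name(conn, 2)   # synonym, resolved to its correct name
father = correct_name(conn, 3)   # already a correct name
```

The same statement serves both parents, and the driver takes care of quoting the bound value.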

are you able to share some data, so that I can see your piece of code at work?

mfrasca commented 6 years ago

oops, sorry, I meant to edit your post, so that it would show the code as pre-formatted, but this is not an option for comments sent from email.

Ejgouda commented 6 years ago

I did not reply to the list, so I am resending.

On 14-10-18 at 22:37, Mario Frasca wrote:

> Eric, I have some idea of SQL ;-) isn't it a bit like asking you if you know of plants … I don't particularly enjoy deciphering PHP/Perl, but I will survive, your logic is clear enough.
>
> in your query, you do select *, then only use a couple of fields. you might want to optimize that.

Hi Mario

For one record this is not needed.

For a general search in my database interface, for example, I search all fields in a table for the searched text, and any result comes back within 0.1 sec, even on my specimen table with over 265K records. I simply started doing it this way and there was never any reason to change it, but I also have an advanced search in which I can search in field combinations etc.

> I don't see what you do for nothotaxa

Nothotaxa just have a genus name, like all others. A full list of all Bromeliaceae names can be found here: http://bromeliad.nl/taxonlist (all calculated with that algorithm, nothotaxa at the end)

> what is the role of BF 372?

What is BF 372?

> I suppose that $row['hyb'] is either empty or a ×, or can it be H and +?

Yes, standard ITF2
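
The hybrid marker discussed here is what the posted function places between the two parent names when it builds a hybrid formula, after dropping the leading words of the father name that the mother name shares. A Python sketch of that step, under the assumption that both parent names are already resolved to plain strings (the function name `hybrid_formula` and the sample names are invented):

```python
def hybrid_formula(mother: str, father: str, marker: str = "×") -> str:
    """Join two parent names with a hybrid marker, not repeating a shared leading part.

    Words of the father name are dropped for as long as the mother name
    starts with the same words (typically the genus).
    """
    words = father.split()
    prefix = ""
    shared = 0
    for word in words:
        candidate = f"{prefix} {word}".strip()
        if not mother.startswith(candidate):
            break
        prefix = candidate
        shared += 1
    remainder = " ".join(words[shared:])
    # strip() avoids a trailing space when nothing of the father name is left
    return f"{mother} {marker} {remainder}".strip()


same_genus = hybrid_formula("Planta alba", "Planta nigra")
two_genera = hybrid_formula("Planta alba", "Herba nigra")
```

Here `same_genus` is "Planta alba × nigra" and `two_genera` is "Planta alba × Herba nigra". Like the original, the comparison is a plain string prefix test, so it does not check word boundaries in the mother name.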

> you might want not to reuse a parameter name as a local variable ($row).

Why not? It is a copy of the array on the stack. I use that for field functions, like name calculation when a record is saved.

> is there anything like PEP8 in Perl/PHP?

PEP8 does not ring a bell for me

> assigning a numerical value to ranks might help with tests like is_infraspecific($row['rank']), and would help make your code more flexible.
>
> I'm surprised to see iteration where I would expect recursion (going up the taxonomy). Perl can do object oriented, can't it?

I work in PHP only and use recursion where needed. PHP can do object oriented too, but mostly I do not use that.
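
The two quoted suggestions (numeric ranks and recursion up the taxonomy) could look like this in Python. It is a sketch only: the `Rank` values are arbitrary, chosen so that a more specific rank gets a larger number, and the sample taxa and the helper `lineage` are invented for the illustration.

```python
from enum import IntEnum


class Rank(IntEnum):
    """Ranks ordered from general to specific; the numbers themselves are arbitrary."""
    FAMILY = 10
    GENUS = 20
    SPECIES = 30
    SUBSPECIES = 40
    VARIETY = 50
    FORMA = 60


def is_infraspecific(rank: Rank) -> bool:
    """True for any rank below species."""
    return rank > Rank.SPECIES


# Invented sample data, keyed by id.
TAXA = {
    1: {"rank": Rank.GENUS, "epithet": "Planta", "parent_taxon_id": None},
    2: {"rank": Rank.SPECIES, "epithet": "alba", "parent_taxon_id": 1},
    3: {"rank": Rank.SUBSPECIES, "epithet": "minor", "parent_taxon_id": 2},
}


def lineage(taxon_id, taxa):
    """Recursively collect (rank, epithet) pairs from the topmost ancestor down to the taxon."""
    row = taxa[taxon_id]
    parent_id = row["parent_taxon_id"]
    above = lineage(parent_id, taxa) if parent_id is not None else []
    return above + [(row["rank"], row["epithet"])]
```

For example, `[epithet for _, epithet in lineage(3, TAXA)]` yields `["Planta", "alba", "minor"]`, and `is_infraspecific(TAXA[3]["rank"])` is true while `is_infraspecific(Rank.GENUS)` is false. A single comparison replaces the chain of string tests on rank names.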

> are you able to share some data, so that I can see your piece of code at work?

See above: http://bromeliad.nl/taxonlist, or http://bromeliad.nl/encyclopedia

In the last one you can see the parent taxa (mostly subgenus) between brackets

Eric
