Open GoogleCodeExporter opened 9 years ago
I have to say - I quite like the relations without 'is' and with underscores,
as this serves to mark them out clearly from plain English terms that have the
same name and makes the structure of Manchester Syntax statements clearer.
I assume that the argument for adding an initial 'is' is that it makes OBO
relationships and OWL MS look more like English (if you ignore the odd quoting):
e.g.
finger 'is part of' some hand
is a bit more like English than
finger 'part of' some hand
But sometimes the 'is' makes DL queries less readable. Say I want to write a
DL query that refers to an anonymous class without specifying a genus:
Arguably:
'is located in' some ('is part of' some skull)
is less readable than
'is located in' some ('part of' some skull)
And if what we want to get across is that
finger 'part of' some hand means
all finger(s) are 'part of' some hand - then perhaps the version lacking the
'is' is less misleading?
This is even more apparent when referring to relations in free text. Compare:
"finger 'is part of' some hand means all finger(s) are 'is part of' some hand"
to
"finger part_of some hand means that all fingers are part_of some hand"
Really, the only way to make the former readable is to abandon using the
relation name altogether and rely on the plain English equivalent:
"finger 'is part of' some hand means all finger(s) are part of some hand"
This is probably OK with 'part of' but is likely to be misleading in other
cases, given that relations often have much more specialised meanings than the
regular English words and phrases used to name them.
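To make the comparison concrete, here is a minimal Python sketch that renders the expressions above. The helper names `ms_name` and `some` are hypothetical and not part of any OWL tool, and the rule "quote a name only when it contains a space" is simply my assumption about the quoting used in these examples:

```python
def ms_name(label: str) -> str:
    """Quote a relation label for display if it contains a space (assumed rule)."""
    return f"'{label}'" if " " in label else label


def some(relation: str, filler: str) -> str:
    """Render an existential restriction: <relation> some <filler>."""
    # Parenthesise a nested restriction so the structure stays unambiguous.
    if " some " in filler:
        filler = f"({filler})"
    return f"{ms_name(relation)} some {filler}"


# The same anonymous class under the three naming styles discussed here.
for part_of in ("is part of", "part of", "part_of"):
    print(some("is located in", some(part_of, "skull")))
```

The underscore form is the only one that needs no quotes at all, which is part of why I find it easier to scan.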
Original comment by dosu...@gmail.com
on 20 May 2012 at 7:24
The motivation was first that there be a uniform way we write relations. Since
our current set of relations are a mix of grammatical forms the proposal was to
make this uniform. In addition, as you point out, there may be issues about how
labels are read in English. However it seems these issues cut both ways, and
that adopting a consistent way of choosing the names, will, in the long run,
make it easier for users to understand what they see. By users, I mean
developers, btw. If you want to present ontologies to end users you will
(always) have to do more work to ensure that the results are colloquially
understandable.
BTW, Barry, in private communication, asks why we don't use underscores in
relation names. I answered him by quoting his paper
"Survey-based naming conventions for use in OBO Foundry ontology development"
Naming Conventions
Our proposed set of naming conventions, founded on the survey results, is
summarized in Table 1. In further discussions, we refer to the entities of
which an ontology consists (in some circles these are called classes and
relations) as its representational units [19]. A representational unit can be
accompanied by one or more synonymous names of different categories. Any type
of name that is chosen to be displayed in the hierarchy is called 'display
name' (called 'browser key' in Protégé). Where the form of that name is
controlled by a set of explicit rules we refer to it as a 'formal name'. To
ensure that the conventions proposed here are expressed unambiguously we employ
the following additional name categories, which we hope will also have general
utility:
...
3.3 Use the bar space (' ') character as word separator, just as it would
normally appear in the language of choice. Where use of the bar space is not
allowed by the type of representational unit in use to store a name, the
underscore ('_') should be used instead. Camel case should not be used as a
means of word separation.
Original comment by alanruttenberg@gmail.com
on 20 May 2012 at 8:04
It may be worth noting that this paper is dated 2009, and the survey itself was
conducted among 66 people in 2007. It also refers to things that have evolved,
such as Protege 3's way of displaying classes by rdf:ID.
In past discussions with other heavy developers, and in my own experience, not
using underscore is a pain, especially considering the auto complete feature
(for which you, Alan, submitted another tracker item suggesting adding a new annotation
property). As mentioned, there will anyway be extra work to provide a nice human
friendly user name, so why not make things easier on the developer and just use
underscores? No need for yet an extra duplication of the label as yet a new
annotation, and no need for a new way to handle things in Protege. I also agree
with David that it makes things much more readable, whether in Manchester
syntax, papers or general written communication.
Original comment by mcour...@gmail.com
on 20 May 2012 at 9:40
I agree with Melanie, I personally greatly prefer the underscores in relations
as I find it easier in various autocomplete functions and less confusing for
new users in distinguishing classes from properties.
Original comment by haen...@ohsu.edu
on 20 May 2012 at 9:52
+1 for using underscores and dropping the "is".
Another argument for doing it this way is precedent - I know we shouldn't be
bound by legacy, but "part_of" is the de-facto standard label in dozens of
ontologies some of which have been around >10 years. This is the form that has
been published in however many papers, including the 2005 OBO-Relations paper.
Original comment by cmung...@gmail.com
on 20 May 2012 at 9:57
I don't see why all these issues can't be resolved with the proposal I put
forth in issue 32. I think it sets a bad example to stick to what is
effectively jargon, and to be ruled by one authoring tool that could do its job
better (Google knows how to complete terms with spaces without messing with
quotes, for example). However if the overall sentiment on this is consistent
with the few comments so far, I will instead propose that we add a new
alternative term - something like 'natural language string' and have that label
be the one with consistent and proper English labels.
Again, I see this as intrusion of user interface into ontology best practices.
No English speaker uses underscores in their usual language, and most try to
use verbs consistently. The completion software can (and should) be fixed to
make your lives easy. Here are some ways:
1) stop insisting on quotes for completion. Google manages to get away with a
single character in g+, and then de-emphasizes it typographically when you are
finished completing.
2) Have an option where Protege understands during completion that when typing
an underscore it should be considered a space if there is no term that has an
underscore and there is one with a space.
3) Have Protege be more generous with completion choices - offering matches in
the middle of terms below matches from the front, and offering recently chosen
completions at the top of the list when they match, and match on first letters
of words, so that ipo or po (matching from the middle) both offer is part of.
4) Make it supereasy to add abbreviations - see e.g. http://www.typeit4me.com/.
So someone should be able to easily say "I want you to offer me is part of
every time I'm in a relation completion context and type an initial p."
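Suggestions 2 and 3 can be sketched in a few lines of Python. This is only an illustration of the matching rules, not how Protege works today; the function names (`normalize`, `initials`, `complete`) are hypothetical, and for brevity it always treats an underscore as a space rather than doing so only when no term with an underscore exists:

```python
def normalize(text: str) -> str:
    """Lower-case and treat underscores as spaces (simplified suggestion 2)."""
    return text.lower().replace("_", " ").strip()


def initials(label: str) -> str:
    """First letters of each word, e.g. 'is part of' -> 'ipo'."""
    return "".join(word[0] for word in normalize(label).split())


def complete(typed: str, labels: list[str], recent: tuple[str, ...] = ()) -> list[str]:
    """Rank candidate labels for what was typed (suggestion 3)."""
    query = normalize(typed)
    ranked = []
    for label in labels:
        name = normalize(label)
        if name.startswith(query):
            rank = 0  # match from the front
        elif query in name or query in initials(label):
            rank = 1  # match in the middle, or on first letters of words
        else:
            continue
        # Recently chosen completions sort ahead of everything else.
        ranked.append((label not in recent, rank, label))
    return [label for _, _, label in sorted(ranked)]


relations = ["is part of", "has part", "is located in"]
print(complete("ipo", relations))
print(complete("part_o", relations))
print(complete("p", relations, recent=("is part of",)))
```

With rules like these, typing ipo, po or part_o all offer is part of, so nobody has to type quotes or remember whether a label uses spaces or underscores.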
Original comment by alanruttenberg@gmail.com
on 21 May 2012 at 1:19
In this case the discussion is not so much about what Protege supports or not,
but what people working on and with those resources feel comfortable with.
Several of us are of the opinion that we should be using underscores, for
reasons of readability among others. The counterargument was the paper by Schober
et al., which is based on a 5-year-old survey, and may not reflect tool and
user reality anymore. The second counterargument was that it will be easier
for the end user to read; but at the same time comment #2 above justifies
keeping "is" in relation names by saying "it's ok, we'll anyway need to
preprocess for end user", so it seems like that point is moot.
Would one option be to poll the BFO community and decide based on preference?
Original comment by mcour...@gmail.com
on 21 May 2012 at 3:16
re: "Would one option be to poll the BFO community and decide based on
preference?" - please file procedural issues separately as Type-BFO2-Process.
Re: other comments, seems like we are going in circles. The question isn't
whether, but how. One proposal has uniformity in editor preferred label and
legacy in an alternative term, and the other the other way around.
Original comment by alanruttenberg@gmail.com
on 21 May 2012 at 3:56
Procedural issue created at http://code.google.com/p/bfo/issues/detail?id=34
Re circles, both options are not equivalent. I was suggesting we use rdfs:label
with value part_of, while you are suggesting to have an rdfs:label with value
is part of, a new annotation property with value part_of, and/or require some
development from the protege team. If you disagree, it would be helpful if you
could provide an example of both cases as you see them, with all label related
annotation properties, to make sure we work with the same basis.
Original comment by mcour...@gmail.com
on 21 May 2012 at 4:30
This issue has nothing to do with Protege. I'm arguing that the current
typographical convention is good *because* it helps distinguish relations from
English words/phrases with the same spelling. The 'is' makes some Manchester
Syntax expressions less readable and also makes at least some references to
relations in free text extremely clunky.
A poll of OBO Foundry + BFO folks seems a reasonable way to resolve this if we
can't resolve it here.
Original comment by dosu...@gmail.com
on 21 May 2012 at 4:21
+1 for using underscores
neutral regarding the use of "is"
However, this discussion should not further delay the release of a BFO2OWL
trial version.
Once such a trial version can be tested by the user community, we will have a
more complete picture about the range of opinions and can take a more informed
decision.
Original comment by steschu@gmail.com
on 25 May 2012 at 1:48
Original issue reported on code.google.com by
alanruttenberg@gmail.com
on 20 May 2012 at 4:14