Open GoogleCodeExporter opened 9 years ago
I think this is done, right? Can this be closed?
Original comment by carla...@gmail.com
on 16 Apr 2009 at 6:15
How about we just remove the MVZ, switch some labels, and leave it open? Higher
Geog is still a mess.
Prov. is prevalent, as in Africa, Angola, Benguela Prov.
Same for Dept. and Depto., as in Africa, Ivory Coast, Dept. Abidjan or Central
America, El Salvador, Depto. Cabanas
There is still lots of crazy not-geography "averaging":
Africa, Cameroon, Nord Prov.
Africa, Cameroon, Nord-Ouest Prov.
Africa, Cameroon, Ouest Prov.
There are (perhaps appropriately?) African Counties. This one caught my eye
because of the slashie: Africa, Guinea, B/kama County
I don't know what this is:
Central America, Honduras, Depto. Islas de la Bahia, Islas de la Bahia
Central America, Honduras, Depto. Islas de la Bahia, Islas de la Bahia, Isla de
Roatan
Or this:
South America, Chile, Metropolitan Region (=Region Metropolitana de Santiago)
South America, Chile, Region I (=Region de Tarapaca)
South America, Chile, Region II (=Region de Antofagasta)
South America, Chile, Region III (=Region de Atacama)
South America, Chile, Region IV (=Region de Coquimbo)
South America, Chile, Region IX (=Region de la Araucania)
Don't forget the non-geological continents (along with some political changes,
which we have no actual way of handling):
Eurasia, Russia
Eurasia, U.S.S.R.
....and so on and so forth.
OK, one more, but an easy one:
no higher geography recorded
no specific locality
unknown
Original comment by dust...@gmail.com
on 17 Apr 2009 at 12:11
agreed, still needs mega work. this is a better representation of the issue.
thanks.
Original comment by carla...@gmail.com
on 17 Apr 2009 at 12:29
A noble cause, but how will we know when this issue is closed? For years, I put
lots of effort into cleaning UAM's higher geography. More inconsistencies (at
best) keep getting added. I want to sweep lots of this legacy noise into
verbatim_locality, put serious effort into georeferencing, and abandon string
matches against hopeless vocabulary ASAP. Some of this is original data, but, in
the quest for consistent (and even particular bureaucratic) search criteria, an
unknown amount is interpreted after the fact.
Original comment by gordon.jarrell
on 17 Apr 2009 at 5:31
This is one of those things that we as a community need to prioritize. Is it
worth trying to clean up what we have, or cleaning up the obvious parts (the
{abbr.} bits would be fairly easy to get), or ignoring this altogether in the
hope that we'll have a locality service in the future?
Perhaps we should revisit who's allowed to alter geography - these things didn't
magic themselves in. Current users with manage_geography are:
uam> select GRANTEE from DBA_ROLE_PRIVS where GRANTED_ROLE='MANAGE_GEOGRAPHY';
GRANTEE
------------------------------
BRANDY
DLM
VOLEGUY
PDRUCKEN
ANDRES_LOPEZ
MKOO
PATTON
ATROX
JMALANEY
CCICERO
CINDY
GORDON
JLDUNNUM
LAM
TUCO
AHOPE
Original comment by dust...@gmail.com
on 17 Apr 2009 at 6:34
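For what it's worth, the "{abbr.} bits" mentioned above could probably be found with something like the sketch below. It is only an illustration: the abbreviation list is limited to the examples quoted earlier in this thread, and the function names (`find_abbreviations`, `flag_rows`) are made up for this example rather than anything that exists in Arctos.

```python
import re

# Assumed abbreviation list, taken only from the examples quoted in this thread.
ABBREVIATIONS = ("Prov.", "Dept.", "Depto.")

# Match any listed abbreviation as a whole token (not inside a longer word).
ABBR_PATTERN = re.compile(
    r"(?<!\w)(?:" + "|".join(re.escape(a) for a in ABBREVIATIONS) + r")(?!\w)"
)


def find_abbreviations(higher_geog):
    """Return the abbreviations found in one higher_geog string, in order."""
    return ABBR_PATTERN.findall(higher_geog)


def flag_rows(rows):
    """Return (higher_geog, abbreviations) pairs for rows that contain any."""
    flagged = []
    for value in rows:
        hits = find_abbreviations(value)
        if hits:
            flagged.append((value, hits))
    return flagged


if __name__ == "__main__":
    sample = [
        "Africa, Angola, Benguela Prov.",
        "Central America, El Salvador, Depto. Cabanas",
        "Eurasia, Russia",
    ]
    for value, hits in flag_rows(sample):
        print(value, "->", hits)
```

This only flags candidates; deciding what each abbreviation should become is still a human decision.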
We will have some leverage on operators who require "Bureau of Land Management
Soggy Meadows Caterpillar Refuge and Management Area" when we can tell them to
tell BLM to give us GIS-shape files for their singular view of the planet's
surface. In the meantime, our users need what they need, or they need what they
think they need.
I see names there I'd love to subtract, but there would be at least hard
feelings.
On the other hand, with the addition of MVZ's relatively cosmopolitan records,
we could start to stabilize as we approach global coverage.
Original comment by gordon.jarrell
on 17 Apr 2009 at 7:35
Then I propose eliminating everything except higher_geog from table
geog_auth_rec. We should allow people anything they think they need, rather than
allowing them anything they think they need as long as they can cram it into our
arbitrary categories, if that is the goal. We're currently pretending to
maintain some sort of authority while not actually doing so, and that confuses
users and limits access to data. I think users and operators would be happier if
we dropped the pretenses. We're demonstrably unable to maintain actual authority
given the current table structure.
Original comment by dust...@gmail.com
on 17 Apr 2009 at 7:55
Adding Social tag - AC needs to prioritize this.
Original comment by dust...@gmail.com
on 9 Feb 2010 at 1:37
I'm copying Michelle on this thread. In the absence of a locality service,
could we make a link from the "Create Higher Geography" form
(http://arctos.database.museum/Locality.cfm?action=newHG) - also find
geography? - to the document that Michelle put together for standardized names?
I for one have no idea where to find that, and it would be useful for users
entering new names, and also for cleaning up some of the messy ones. I know that
the funky Chile names are ours, but not sure what those should be. Looking up
Chile's subdivisions on that doc would be helpful.
Original comment by carla...@gmail.com
on 8 Mar 2010 at 10:22
We seem to have all lost interest in this, and it may not matter in light of
2012 locality changes. AC?
Original comment by dust...@gmail.com
on 3 Jul 2012 at 2:55
I haven't lost interest in it, but I might have despaired of it. We will still
be searching a lot of (maybe most) geography by string matches against strings
applied by a hodge-podge of operators, correct? This is not just Arctos's
problem. I would be willing to explore a proposal for development of a
community-wide fix.
Original comment by gordon.jarrell
on 5 Jul 2012 at 3:44
--We will still be searching a lot of (maybe most) geography by string matches
against strings applied by a hodge-podge of operators, correct?
Maybe. We could (pending the Google proposal) search against, or also against,
service-supplied strings now.
--This is not just Arctos's problem. I would be willing to explore a proposal
for development of a community-wide fix.
No idea what that means - who else can access Service data but has no
geospatial capability? Maybe geospatial capability is irrelevant - not really
sure. My only interest in higher geog is for use as a "standard" in my cleaning
service, and to be useful for that we need a singular assertion for any place
(eg, not with and without island_group, etc.).
Original comment by dust...@gmail.com
on 8 Jul 2012 at 3:46
Questions:
By "service-supplied strings," you mean your higher-geog service, or do you
mean an external service, like that Berkeley thing?
Don't get the "maybe." Yes or no would be clearer.
"Google proposal" is fuzzy in my mind. I thought Link was asking Google to
extend access to maps. If there's more, it went by me.
- What I meant, and I'm not certain I'm correct, is that most or all other
higher-geographic queries on biodiversity data rely on string-matches to
essentially collector-supplied strings. Or in other words, the first line
of my message applies to more than Arctos. If so, a solution might be
sought as a supplement to VertNet, or be a stand-alone proposal on its own
merits.
- At one point, the singular-assertion standard would never have passed
political muster: different collections, and perhaps different disciplines
had (or still have) their own ideas about what their users want to match in
higher geography. We could push for a singular-assertion standard, but we
would need a huge clean-up of the legacy. And, even if there was agreement
in principle, the particulars could still inflame passion and become
protracted. On top of that, there are all the difficulties we've
experienced with standardizing other vocabularies; namely that the
vocabulary turns out to be unexpectedly vague in the first place.
- My take on standardization is, been there, done that. It didn't even
approximately work. Shape-assigned tags are more scalable, if we have a
service to which we can comfortably add shapes, especially bureaucratic
constructions.
Original comment by gordon.jarrell
on 8 Jul 2012 at 6:19
service-supplied strings = data available from something like
http://maps.googleapis.com/maps/api/geocode/json?latlng=23,-82&sensor=false
maybe==it's a decision we need to make
the proposal is to extend our access to google services - it's been submitted
the nice thing about using service-supplied data for query is that the curators
can keep on doing whatever ridiculous thing makes them happy, but at the same
time we can give users tools with which to find specimens
not really that interested in solving problems for anyone else
I think controlling what's acceptable for geography is well within the mission
of the AC
I see no evidence that anyone's attempted to standardize anything about
geography, at least not with some firm goal in mind, and here's another place
where "do everything" ends up doing nothing. I have clear functional
requirements, and this is a tractable problem.
Original comment by dust...@gmail.com
on 8 Jul 2012 at 6:33
Okay. You wanted to close the issue, right? Okay by me. Wait and see
what falls out of everything else, then make new issues. Can't see what
googleapis does, but am curious to know if we can get or provide shapes
such as "Pacific International Fishing Zone IV," etc.
Original comment by gordon.jarrell
on 8 Jul 2012 at 9:16
The amended issue (geog==mess) is still valid even if the reasons for cleaning
up may have changed. I'm happy to keep this around if there's still a chance of
suckering someone into fixing the data.
Googleapis returns strings that, at various levels, describe a point. It tends
to be political - country, county, etc. You'll probably have to write your own
service to find your fishing zone.
Original comment by dust...@gmail.com
on 9 Jul 2012 at 5:17
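As a rough illustration of "strings that, at various levels, describe a point": the sketch below pulls the political levels out of an already-parsed geocode JSON response and joins them into one string. The `results` / `address_components` / `long_name` / `types` layout follows the Google Geocoding response format; the choice of levels, the function names, and the sample data are assumptions made up for this example (the sample is hand-written, not real service output).

```python
# Component types to keep, broadest first. The type names come from the
# geocoding response format; which ones to keep is an assumption.
LEVELS = ("country", "administrative_area_level_1", "administrative_area_level_2")


def political_names(response):
    """Collect one long_name per level from a parsed geocode JSON response."""
    found = {}
    for result in response.get("results", []):
        for component in result.get("address_components", []):
            for level in LEVELS:
                # Keep the first name seen for each level.
                if level in component.get("types", []) and level not in found:
                    found[level] = component["long_name"]
    return [found[level] for level in LEVELS if level in found]


def service_string(response):
    """Join the political names into a comma-separated string."""
    return ", ".join(political_names(response))


# Hand-written sample in the shape of a geocode response (not real output).
SAMPLE = {
    "status": "OK",
    "results": [
        {
            "address_components": [
                {"long_name": "Example County", "short_name": "Example County",
                 "types": ["administrative_area_level_2", "political"]},
                {"long_name": "Example State", "short_name": "ES",
                 "types": ["administrative_area_level_1", "political"]},
                {"long_name": "Exampleland", "short_name": "EX",
                 "types": ["country", "political"]},
            ]
        }
    ],
}

if __name__ == "__main__":
    print(service_string(SAMPLE))
```

Anything non-political (fishing zones and the like) would not show up in these components, which matches the point above about needing a separate service for those.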
Either way. The goofy data will be there as a reminder! If the AC agrees
that higher-geog strings with the same meaning can be fixed across
collections without row-by-row consultations, folks like me might slay them
as we find them. On the other hand, we could build a look-up/replace
spreadsheet to run against the whole mess.
The political designations are probably the easiest. Sounds like there
might be room for our own service, but I can wait and see.
Original comment by gordon.jarrell
on 9 Jul 2012 at 5:43
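A minimal sketch of the look-up/replace idea, assuming the spreadsheet is exported as CSV with two columns. The column names (`old`, `new`), the function names, and especially the replacement values are hypothetical and only there to show the mechanics; they are not a proposed standard.

```python
import csv
import io

# Hypothetical spreadsheet contents: column names and replacement values are
# illustrative only, not an agreed standard.
LOOKUP_CSV = '''old,new
"Africa, Angola, Benguela Prov.","Africa, Angola, Benguela Province"
"Central America, El Salvador, Depto. Cabanas","Central America, El Salvador, Cabanas Department"
'''


def load_lookup(csv_text):
    """Read old -> new pairs from CSV text exported from the spreadsheet."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["old"].strip(): row["new"].strip() for row in reader}


def apply_lookup(values, lookup):
    """Return (cleaned, changes), replacing exact matches only."""
    cleaned = []
    changes = []
    for value in values:
        replacement = lookup.get(value.strip(), value)
        if replacement != value:
            changes.append((value, replacement))
        cleaned.append(replacement)
    return cleaned, changes


if __name__ == "__main__":
    lookup = load_lookup(LOOKUP_CSV)
    cleaned, changes = apply_lookup(
        ["Africa, Angola, Benguela Prov.", "Eurasia, Russia"], lookup
    )
    for old, new in changes:
        print(old, "=>", new)
```

Exact matching is deliberate here: anything not listed in the spreadsheet is left alone, so the list of changes can be reviewed before touching real data.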
Here's my take:
Everything should be georeferenced, no matter how crudely/imprecisely.
Otherwise you're stuck with curatorial assertions, and those are mostly useless.
Once it's georeferenced, we can use the service to get standardized geography
strings. People who can't figure out how to draw a box on a map will use those.
The rest of us will draw boxes on maps when we want specimens from somewhere
special.
A body of unique higher descriptors can be used in a data cleaning service, and
those cleaned data can then be used for things like semi-automated
georeferencing.
Original comment by dust...@gmail.com
on 9 Jul 2012 at 5:50
We then have to reduce the many ways in which the coordinates may be
interpreted as a point location, and I fear there are many.
I'm more ambitious about shapes and their descriptive strings, but when we
get this far, we can get more specific.
I'm happy.
Original comment by gordon.jarrell
on 9 Jul 2012 at 6:36
That's one problem with the google service - knowing when to stop. It usually
gives you a street address, but it's hard to tell (as a computer - pretty
obvious to people) when you've got too much precision. The addition of an error
("show me names of things that this circle is entirely within") would be a huge
improvement. In the meantime, I suppose the real "collecting point" _could_ be
the street address, so I'm keeping it (potentially for search - now, it's doing
nothing).
Original comment by dust...@gmail.com
on 9 Jul 2012 at 6:42
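The "show me names of things that this circle is entirely within" test could look roughly like the sketch below. It is an approximation under stated assumptions: named places are simple latitude/longitude bounding boxes (real shapes would need a GIS library), the metres-per-degree figure is a rough average, the circle's east-west extent is taken at its centre latitude, and boxes crossing the antimeridian are not handled. All names and boxes are hypothetical.

```python
import math

METERS_PER_DEGREE_LAT = 111_320.0  # rough average; an approximation


def circle_within_box(lat, lon, radius_m, box):
    """True if the error circle fits inside box = (south, west, north, east)."""
    south, west, north, east = box
    dlat = radius_m / METERS_PER_DEGREE_LAT
    cos_lat = math.cos(math.radians(lat))
    if cos_lat <= 0:
        return False  # at a pole the longitude extent is undefined
    # Degrees of longitude shrink with latitude, so the same radius spans more.
    dlon = radius_m / (METERS_PER_DEGREE_LAT * cos_lat)
    return (
        lat - dlat >= south
        and lat + dlat <= north
        and lon - dlon >= west
        and lon + dlon <= east
    )


def names_containing(lat, lon, radius_m, named_boxes):
    """Return the names of all boxes that entirely contain the circle."""
    return [
        name
        for name, box in named_boxes
        if circle_within_box(lat, lon, radius_m, box)
    ]


# Hypothetical nested places, broadest first.
PLACES = [
    ("Region", (0.0, 0.0, 10.0, 10.0)),
    ("District", (4.0, 4.0, 6.0, 6.0)),
    ("Street address", (4.999, 4.999, 5.001, 5.001)),
]

if __name__ == "__main__":
    # A point with a 10 km error fits the district but not the street address.
    print(names_containing(5.0, 5.0, 10_000, PLACES))
```

With an error radius attached, the over-precise names drop out on their own, which is the "knowing when to stop" behaviour described above.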
Original issue reported on code.google.com by
dust...@gmail.com
on 23 Jan 2009 at 11:39