Closed GoogleCodeExporter closed 9 years ago
What do you mean by "repartition 'geospatial metadata'"?
Original comment by gordon.jarrell
on 17 Jun 2009 at 8:15
This entire issue is about repartitioning geospatial metadata. We should be
able to find and share when-and-where data without needing to memorize random
integers, poke holes in our security, or be fluent in historical placenames.
We can't do that with the current structure.
We have a completely nonsensical mix of geospatial data (dec_lat), descriptive
data (continent_ocean), untransformed data (lat_deg), and stuff about specimens
(collecting_method) scattered around our 4 spatiotemporal tables, and we can't
keep track of the important parts without bringing along lots of stuff we don't
want.
Original comment by dust...@gmail.com
on 17 Jun 2009 at 9:11
adding social label - this isn't getting fixed without a proposal & more
discussion
Original comment by dust...@gmail.com
on 24 Jul 2009 at 9:56
From an old email exchange...
Arctos has locality data in 4 tables:
geog_auth_rec
locality
collecting_event
lat_long
That structure, along with being difficult to write code to, doesn't jibe very
well with how data are collected and used. For example, there are times when it
would be hugely beneficial to share everything about a collecting event between
specimens, but where collecting method differs - rat caught in Museum Special,
worm from that rat "caught" in 100-mesh sieve, etc. Since collecting_method is
in collecting_event, we can't do that with one event, meaning we just doubled
our chances of mucking up the event/locality connection.
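A rough sketch of what sharing one event could look like, assuming
collecting_method moves off the event and onto a specimen-to-event link. The
class names, field names, and sample values here are hypothetical, not the
Arctos schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectingEvent:
    """Place and time only; nothing specimen-specific lives here."""
    event_id: int
    locality: str
    began_date: str
    ended_date: str


@dataclass(frozen=True)
class SpecimenEvent:
    """Hypothetical link: one specimen, one event, and that specimen's method."""
    specimen: str
    event_id: int
    collecting_method: str


# Made-up example values, for illustration only.
event = CollectingEvent(1, "hypothetical trapline", "2009-06-17", "2009-06-17")
links = [
    SpecimenEvent("rat", event.event_id, "Museum Special"),
    SpecimenEvent("worm from that rat", event.event_id, "100-mesh sieve"),
]


def events_used(specimen_links):
    """Return the distinct event ids referenced by a list of links."""
    return {link.event_id for link in specimen_links}
```

Both specimens point at the same event, so there is only one event/locality
connection to get right, and the methods still differ per specimen.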
Higher geog forces us to make arbitrary choices, and those end up being
taxon-dependent. I'm pretty sure you could use Arctos to prove that there are no
moose in the "Yukon-Tanana uplands" or no plants in "Game Management Unit 20,"
never mind that those are largely the same thing. It's even goofier when you
consider things like language (Russia or Россия?) or dynamic political
boundaries (maybe it should really be Soviet Union - or maybe now Belarus!).
The way in which we store shapes (Point-Radius) is fairly primitive. The circles
around "New Mexico, 8000 feet" and "Yukon River, Alaska" both contain a LOT of
unlikely acres.
We have no way of spatially querying error. I have a fake spatial query widget
implemented, but it only considers points - irrespective of whether error is a
meter or a light-year, and it returns an all-or-nothing result set.
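For comparison, a minimal sketch of a query that does consider error, assuming
both the search area and each georeference are point-radius circles given as
(lat, lon, radius in meters). The function names and the three result labels
are made up for illustration; this is not the existing widget:

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classify(query, georef):
    """Compare a georeference circle to a query circle.

    Both arguments are (lat, lon, radius_m) tuples.
    "inside":   the whole error circle fits within the query circle
    "possible": the circles overlap, so the record might be in the query area
    "outside":  the circles do not touch
    """
    distance = haversine_m(query[0], query[1], georef[0], georef[1])
    if distance + georef[2] <= query[2]:
        return "inside"
    if distance - georef[2] <= query[2]:
        return "possible"
    return "outside"


# Illustrative coordinates only: a 10 km search circle.
query = (64.8, -147.7, 10_000.0)
print(classify(query, (64.8, -147.7, 1.0)))        # same point, 1 m error
print(classify(query, (64.8, -147.7, 500_000.0)))  # same point, 500 km error
print(classify(query, (35.0, -106.6, 1.0)))        # far away, 1 m error
```

A points-only query would treat the first two records identically; here the
second comes back as "possible" rather than as a plain hit.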
Those aren't the only problems, but they're exemplary.
I think the solution is obvious, even if the details aren't: replace the
"coordinates as an afterthought" model with a "coordinates as the data" model,
and let machines figure out if a given georeference is in Russia, or what was
once Russia, or has a chance of being in Russia, or just someplace I can see
from here, or whatever. (We can still keep any number of strings describing the
coordinates and of course the collector's notation, we just don't have to
locate specimens using only those.)
I think the breakdown for storing data is roughly the following:
1) Geospatial data (shapes)
2) Attributes of geospatial data (determiner, reference, as_defined_on_date,
   etc.)
3) Temporal collecting data
4) Event attributes (method, habitat description, etc.)
5) Verbatim assertions (Curator's assigned geography string, collector's
   locality description, etc.)
6) Geologic data (Formation, Period, etc.)
Those need more consideration, and can be refined along the way.
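One way the six-part breakdown above might look as data structures. This is
only a sketch to make the list concrete: every class and field name is an
assumption except determiner, reference, and as_defined_on_date, and storing
the shape as a WKT string is just one possible choice:

```python
from dataclasses import dataclass, field


@dataclass
class Shape:
    """1) Geospatial data, held here as a WKT string (an assumption)."""
    wkt: str


@dataclass
class ShapeAttributes:
    """2) Attributes of the geospatial data."""
    determiner: str
    reference: str
    as_defined_on_date: str


@dataclass
class TemporalData:
    """3) Temporal collecting data."""
    began_date: str
    ended_date: str


@dataclass
class PlaceTimeRecord:
    """Hypothetical container tying the six parts together."""
    shape: Shape
    shape_attributes: ShapeAttributes
    when: TemporalData
    event_attributes: dict = field(default_factory=dict)    # 4) method, habitat
    verbatim_assertions: dict = field(default_factory=dict)  # 5) free-text strings
    geologic_data: dict = field(default_factory=dict)        # 6) Formation, Period


# Made-up values, for illustration only.
record = PlaceTimeRecord(
    shape=Shape("POLYGON((-148 64, -147 64, -147 65, -148 65, -148 64))"),
    shape_attributes=ShapeAttributes("someone", "some map", "2009-06-11"),
    when=TemporalData("2009-06-11", "2009-06-11"),
    verbatim_assertions={"collector_locality": "as written on the tag"},
)
```

The shape is the thing that gets queried; the verbatim strings ride along as
descriptions and are never needed to locate the record.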
It's not clear to me what we can do about storing the actual locality data - I
think GIS shapes are straightforward now, and an improvement over our current
(point-radius) method. Probability Surfaces are better still, and at least
storing/serving them, if not creating them, may be immediately available.
Original comment by dust...@gmail.com
on 10 Mar 2010 at 8:39
Original comment by dust...@gmail.com
on 5 Aug 2011 at 7:06
Original comment by dust...@gmail.com
on 5 Aug 2011 at 7:13
Just trying to bring this back to the top of the pile. I'm not going to jump
all the way in by myself, and MSB Parasites has an immediate funded need to
separate collecting event (place and time) and collecting method (and source)
data. If y'all don't help me come up with a viable model that I can write code
to like about now-ish I'm going to do something drastic - maybe flatten
locality data out into something like our "temporary" taxonomy structure....
Original comment by dust...@gmail.com
on 31 Oct 2011 at 4:01
This sounds like a major change that is best dealt with in person. Can we
organize an Arctos 'pow wow' ???
Original comment by carla...@gmail.com
on 1 Nov 2011 at 4:56
We've been ignoring this formally since Jun 11, 2009, and in various other
capacities since at least the ABQ powwow in 2006, when I think we all realized
that we needed to do things differently. This isn't something that's going to
be solved in an afternoon, so it isn't a good candidate for a meeting.
Original comment by dust...@gmail.com
on 1 Nov 2011 at 2:58
I recall talking about this at the ABQ meeting, or maybe a meeting at MVZ, and
agreeing that collecting method and source are better at the level of cataloged
item rather than collecting event. Do you have the same recollection? What
exactly does MSB need to do that is different from how we do things now?
Original comment by carla...@gmail.com
on 2 Nov 2011 at 6:01
I'm not sure cataloged item is the right place for that stuff either, although
a bit of denormalization here wouldn't bother me too much and that's an easy
partial solution. But locality code is incredibly difficult to write and
maintain, so I'd really like a more complete solution.
MSB needs to collect hosts and parasites from the same event.
Original comment by dust...@gmail.com
on 2 Nov 2011 at 2:36
Original comment by dust...@gmail.com
on 31 Jan 2012 at 9:33
Original comment by dust...@gmail.com
on 26 Jun 2012 at 9:06
Original comment by dust...@gmail.com
on 26 Jun 2012 at 9:07
Original comment by dust...@gmail.com
on 28 Jun 2012 at 4:47
closing this thing for the psychological benefits - re-open with whatever we
miss in the v5.2 update
Original comment by dust...@gmail.com
on 2 Jul 2012 at 4:37
Original issue reported on code.google.com by
dust...@gmail.com
on 11 Jun 2009 at 9:02