ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

watershed #1273

Closed dustymc closed 6 years ago

dustymc commented 7 years ago

ref #1265 - this needs a dedicated discussion

From @amsnyder210:

Currently, "geography" for Arctos records is political (state, county, country, etc). Enabling users to query drainages in higher geography will facilitate the query process in that they do not have to know/guess county or state in order to pull fish records. The MSB fish collection locality records are assigned to US drainage using USGS watershed information via EPA site:

https://cfpub.epa.gov/surf/huc.cfm?huc_code=13020203

...which is accessed via the other authority I use for MSB fishes localities, USGS Geographic Names. https://geonames.usgs.gov/domestic/index.html

MSB fish records have two separate fields for query: Drainage and USGS HUC (Hydrologic Unit). Because we provide information for mostly New Mexico-based research, our locality records are assigned (narrowed) to 7 primary New Mexico drainages: Rio Grande, Pecos River, San Juan River, Gila River, Zuni River, Tularosa Basin, Guzman Basin. In this way, researchers can search on the broader USGS HUC value or the subdrainages for New Mexico.

For example, if someone needs MSB records from the Gila River and its tributaries (of which there are several) in both Arizona and New Mexico, they can search on Gila River Drainage or the Lower Colorado River Watershed HUC 150400 to pull those records. (Currently, in the ARCTOS public query mode there is no way to pull in all MSB fish records (or any fish record), which include tributaries.)
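A minimal sketch of why searching on a shorter HUC pulls in everything beneath it: USGS hydrologic unit codes nest by prefix (2 digits for a region, 4 for a subregion, 6 for a basin, 8 for a subbasin). The record layout and catalog names below are hypothetical and are not how MSB or Arctos store these data.

```python
# Digits per level in an 8-digit USGS hydrologic unit code.
HUC_LEVELS = {"region": 2, "subregion": 4, "basin": 6, "subbasin": 8}


def huc_levels(huc: str) -> dict[str, str]:
    """Split a HUC into the codes of the units that contain it."""
    if not huc.isdigit() or len(huc) % 2 or not 2 <= len(huc) <= 8:
        raise ValueError(f"not a 2-8 digit HUC: {huc!r}")
    return {name: huc[:n] for name, n in HUC_LEVELS.items() if n <= len(huc)}


def records_in_huc(records: list[dict], huc: str) -> list[dict]:
    """Return records whose HUC falls inside the given (shorter) HUC."""
    return [r for r in records if r["huc"].startswith(huc)]


# Hypothetical records; only the HUC values matter here.
records = [
    {"catalog": "fish-1", "huc": "15040001"},
    {"catalog": "fish-2", "huc": "15040004"},
    {"catalog": "fish-3", "huc": "13020203"},
]

print(huc_levels("13020203"))
print([r["catalog"] for r in records_in_huc(records, "150400")])
```

A plain prefix match like this only works if every record carries the full-length code; records assigned at a coarser level would need the comparison run in both directions.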

I would encourage Arctos to broaden its scope to include aquatic records, which might also apply to aquatic mammals and birds.


Currently, "geography" for Arctos records is political (state, county, country, etc).

Not quite - it's possible to avoid politics, or to mix politics and geo, or probably lots of other things.

http://handbook.arctosdb.org/documentation/higher-geography.html

https://cfpub.epa.gov/surf/huc.cfm?huc_code=13020203

That does look like geography to me, whatever geography is.... The map (https://cfpub.epa.gov/surf/images/hucs/13020203.gif) makes me think there's spatial data involved - would it be possible to get the shapefiles (in WKT format)?

Perhaps watershed belongs in "feature" (http://handbook.arctosdb.org/documentation/higher-geography.html#geographic-feature) or could be stored there until we find the resources to do something better?

That would require somehow limiting the scope of "drainage" - "things in the USGS DB" might be a useful starting point. I'm resistant to adding "puddle feeding trickle feeding Neverheardofit Wash" to Geography - that might fit someone's idea of a "drainage," but I think those data are better associated with Locality.

(AWG: We need to have a similar discussion involving tiny islands. What exactly IS "geography worthy"?)

USGS HUC

Those data should be moved to a dedicated and linked Other ID Type.

broaden its scope

"drainage" is also recorded on eg, AK sheep hunt reports. I certainly have no objections to adding the data, I just want to get it into the correct place in the model.

@mkoo "wkt's always gives us a spatial footprint that georeferenced specimens can be viewed in Arctos" - yes, WKT is visible as a polygon on specimen detail and its intersection with the specimen georeference changes the color of the map border, for example.

@campmlc "WKTs also should be usable for other geography that transcends state lines" - yes, that (and counties, etc.) is a major use case for national parks and such. All geography is optional - "[continent]North America, [feature]Tijuana River Watershed" (spans countries) fits fine.

jtgiermakowski commented 7 years ago

howdy! This issue caught my interest as i try to clean up our geography data for upload. it is also relevant to some of our amphibian and aquatic turtle work. In the ARCTOS handbook "Geographic Feature" is defined as "Miscellaneous named and delineated entities below the level of state." If it's possible for it to span countries, or even continents, then it should be easy to implement by adding drainage information in that field. The issue I see is that there ought to be room for hierarchy, similar to the container model in ARCTOS. I can help with getting the WKT and drainage data from USGS.

dustymc commented 7 years ago

span countries hierarchy

I think those two things are generally incompatible - countries contain islands, islands contain multiple countries, islands ARE countries (and/or vice versa), etc.

Our definition of Feature doesn't work for current data (and these data are also non-hierarchical). I think we just need a better definition.


Banff National Park
    British Columbia
    Alberta
Hovenweep National Monument
    Colorado
    Utah
John Day Fossil Beds National Monument
    Oregon
    Washington
Brown's Park National Wildlife Refuge
    Colorado
    Utah
Death Valley National Park
    Nevada
    California
Dinosaur National Monument
    Colorado
    Utah
Jegelewicz commented 7 years ago

I too have records with drainage data and they are not all fish! I have been placing the drainage designation in Locality Remarks and it would be great to have this information become part of a more meaningful search field. For me, this also brings up mountain ranges, especially in our mollusk data, where some species are endemic to single mountain ranges. I am unaware if there are range WKTs, but this could end up being a useful feature as well.

dustymc commented 7 years ago

Locality Remarks

Specific locality ("Rio Grande Drainage: normal specloc stuff...") is probably better from a searchability standpoint. I would expect to find in locality remarks things about the locality data ("spelled bla but we think they meant blah...") and not data itself.

ANYWAY, why not! So feature is:

We might also consider an "exclude string-only geography data" search flag, which would serve as a quality filter and (maybe) encourage "us" to dig up WKT.

Is this resolved, minus a better definition of Feature?

jtgiermakowski commented 7 years ago

I guess i was thinking on somehow linking countries to features, but that's a spatial query implementation (something better accomplished in a PostGIS type environment). As is, features span counties, but are not linked to those either (because I see them as part of entries in higher geography). If records have coordinates, can queries be spatial? in other words, does a query for a specific county capture all features that INTERSECT that county? if so, then problem solved.

Next is hierarchy of features, which is just a reference to "part of", either "self" or "another feature". isn't this how containers work (parent vs. child)?

As far as Feature definition, maybe something like "Miscellaneous named and delineated entities that are described by non-geopolitical boundaries and have an associated extent in WKT"

The benefit of having drainage as features, for example, is the upstream/downstream searchability without knowledge of local rivers and/or counties (again, assuming that searches are spatial, not based on text).

dustymc commented 7 years ago

Features are "linked" to Counties (all geog "entities" to each other) only by appearing in the same row.

The container model is:

container_id PKEY
parent_container_id FKEY (container_id)
{metadata}

each container is unique and has exactly one parent - that's it. (With some cheating at the ultimate parent.)

The "parents" (continent_ocean) of France, for example, are...

West Indies
Indian Ocean
Antarctica
South America
Europe
Pacific Ocean

because France isn't geographic - it's political. (There's a map at http://handbook.arctosdb.org/documentation/higher-geography.html#country) So I suppose we could have lots of "Frances" or we could have a 1-->many relationship "uphill" or something, but that's not really hierarchical.
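To make the contrast concrete, here is a small sketch of the single-parent container model next to the France example. The container IDs are hypothetical, and using `None` for the ultimate parent is an assumption standing in for the "cheating" mentioned above.

```python
# container_id -> parent_container_id (None marks the ultimate parent).
# IDs are hypothetical.
containers = {1: None, 2: 1, 3: 2, 4: 2}


def container_path(container_id: int) -> list[int]:
    """Walk from a container up to the ultimate parent."""
    path = [container_id]
    while containers[path[-1]] is not None:
        path.append(containers[path[-1]])
    return path


# Geography does not fit that shape: one country, many "parents".
france_parents = [
    "West Indies", "Indian Ocean", "Antarctica",
    "South America", "Europe", "Pacific Ocean",
]

print(container_path(4))    # one unambiguous chain
print(len(france_parents))  # no single chain to walk
```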

Sorta related, there are many specimens where we don't know the shape. Kenya seems to reorganize internally every decade or so, and they recycle names when they do. We just know "Whatever, Kenya." To make that spatial we'd need a bunch of time-linked shapes under "Whatever, Kenya" - "give me all the rats from Whatever, Kenya as it was spatially defined on 1954-01-17." We could build that, but I wouldn't much want to try to use it - I'd rather just georeference everything and plug into a service. (A few of us have been fantasizing about "LocalityBank" for a decade or so now, and Arctos is built to use that sort of thing if it ever happens.)

There are limited spatial capabilities - from edit geog you can find specimens claiming to be from the geography but not intersecting with the polygon, for example. (And that ignores a bunch of data - locality WKTs and error radii.) That functionality could be extended a bit, but most geography is not supported by shapes and we don't have true GIS capability - I can fake it for a while (mostly by exploiting Google's APIs), but there are definitely technical limitations under the current infrastructure.
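As an illustration only, the "claims to be from the geography but does not intersect the polygon" check can be sketched in plain Python. This is not how Arctos does it; it assumes a single-ring `POLYGON` in WKT, treats longitude and latitude as planar coordinates, and (like the existing check) ignores locality WKT and error radii. The shape and specimen coordinates are hypothetical.

```python
import re


def parse_wkt_polygon(wkt: str) -> list[tuple[float, float]]:
    """Parse the outer ring of a simple 'POLYGON((x y, x y, ...))' string."""
    match = re.fullmatch(r"\s*POLYGON\s*\(\(([^()]*)\)\)\s*", wkt, re.IGNORECASE)
    if match is None:
        raise ValueError("only single-ring POLYGON WKT is handled in this sketch")
    ring = []
    for pair in match.group(1).split(","):
        x, y = pair.split()
        ring.append((float(x), float(y)))
    return ring


def point_in_ring(x: float, y: float, ring: list[tuple[float, float]]) -> bool:
    """Ray-casting test: count edge crossings of a ray going in +x."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical geography shape and georeferences (longitude, latitude).
shape = parse_wkt_polygon("POLYGON((-109 31, -103 31, -103 37, -109 37, -109 31))")
specimens = {"rat-1": (-106.6, 35.1), "rat-2": (-111.9, 33.4)}

outside = [name for name, (lon, lat) in specimens.items()
           if not point_in_ring(lon, lat, shape)]
print(outside)
```

Real shapes (multipolygons, holes, large vertex counts) are where a true GIS backend earns its keep; the sketch only shows the shape of the question being asked.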

jtgiermakowski commented 7 years ago

OK, container model seems straight forward and France is a good example. I also understand the time component of the geopolitical boundaries. seems that GIS capability is best saved for 2.0

now, as far as Features, i can add higher geography and specific locality but i can't create features. Is there a bulkloader for geography? should i just send you a CSV file with names and WKT to add to features? in Higher Geography, the Feature field is a drop-down.

dustymc commented 7 years ago

I'm happy to create whatever, or we can talk to the AWG about code table access. I've found it's generally worthwhile to enter them manually - there are a LOT of ways of describing most places, it's been impossible to make that consistent at scale, and past bulkloads have caused a lot of duplicates which take a lot of time to resolve when they're eventually identified as such. Adding search terms has also been extremely useful in cleanup (and finding specimens), and I'm not sure how those would bulkload. WKT is weird as well - something in the software stack doesn't like large text files, it's usually easiest to create them as Media (we can get you SCP to TACC if that's useful); geography knows how to deal with remote WKT, just needs media_id to do so.

dustymc commented 6 years ago

Hopeful summary:

1) Is Feature an appropriate container for "watershed"?
2) How should Feature be defined?
3) Is there reason to allow https://cfpub.epa.gov/surf/huc.cfm as geography authority (eg, anything critical there that's not mirrored in wikipedia?)

jtgiermakowski commented 6 years ago

Howdy, Lex (@amsnyder210) and I got together this afternoon and discussed this in more detail. Here's what we think is reasonable:

  1. Yes, for now. Ideally, there would be a field called "Drainage basin" (following Wikipedia, see https://en.wikipedia.org/wiki/Drainage_basin) within "Higher Geography" and which for United States follows USGS terminology for water resources, much like there is currently in ARCTOS a field for "Map Name (Quads)". Thus, for example, "Rio Grande Region", as defined, is

the drainage within the United States of: (a) the Rio Grande Basin, and (b) the San Luis Valley, North Plains, Plains of San Agustin, Mimbres River, Estancia, Jornada Del Muerto, Tularosa Valley, Salt Basin, and Other Closed Basins. Includes parts of Colorado, New Mexico, and Texas.

  2. The definition of "Feature" should include drainage basin information (in case of the US, based on USGS Water Resources of the United States):

Features include named entities such as protected areas (parks, preserves, refuges), water features and other delineated geographical or political features. Feature may also be used to describe recognized sub-groups of islands. Many administrative units included in Feature have ephemeral boundaries, if not an ephemeral existence. Their past and future use may be inconsistent. Therefore, avoid using Feature if the locality is well georeferenced and/or unequivocal in the absence of Feature.

  3. No, no reason to use EPA. Drainage basins should be based on USGS (for US) and only at the higher levels (Boundary Descriptions and Names of Regions, Subregions, and Basins), as described in this source: https://water.usgs.gov/GIS/huc_name.html Wikipedia (https://en.wikipedia.org/wiki/Hydrological_code) has a table that lists these levels and we should only include levels 1, 2, and 3 in ARCTOS. Just as we don't yet have all the NPS units in features, we can upload only the ones that represent cataloged information. HUC codes can go in the Other ID, as discussed previously.

I've got the files with WKT for the level 1, 2, and 3 basins and can upload them as media to ARCTOS, or host a few if we need to give it a test drive.

campmlc commented 6 years ago

I support this solution and propose for resolution during the Geography-related issues in Arctos special topics discussion this Thursday.

dustymc commented 6 years ago

Thanks @jtgiermakowski and @amsnyder210!

Yes "for now" is I think a worthy goal. We've been trying unsuccessfully to do SOMETHING for a long time, this looks like a very useful step to me.

ephemeral existence

That describes almost all geography (except perhaps quads), and has driven much of http://handbook.arctosdb.org/documentation/higher-geography.html#guidelines-for-geographic-terms-in-arctos.

georeferenced

Given that, I can (and do) pull "supplemental geography" from webservices (and consider it with the "any geog" search field). It's not terribly GOOD, but it is consistent (and has led to a fair number of errors, which is useful too). I can't possibly overstress how important georeferences are, nor how important documenting them is (eg, did the coordinates use the transcription error in geography/locality/whatever, or were they pulled from a GPS and the transcription error added independently?)

avoid using Feature

Despite the last two points, I think the various geography terms are still useful because users search on them. If you are going to eliminate something that a user might expect to find/use in the data, I encourage you to document that.

HUC codes

I need two/three things to set that up:

1) ID type ("USGS HUC Code"??)
2) Definition ("USGS Hydrologic Unit Map Code; see https://water.usgs.gov/GIS/huc_name.html for more information" ??)
3) Optionally, a base URL (which when prepended to the identifier value would form a URL)

Base URL

https://cfpub.epa.gov/surf/huc.cfm?huc_code=

plus "13020203" for the ID value would lead to https://cfpub.epa.gov/surf/huc.cfm?huc_code=13020203, for example. (And I'm not suggesting that's the value we SHOULD use!)
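The base-URL mechanism amounts to simple concatenation. A sketch, with a hypothetical function name and the EPA URL used only because it is the example above:

```python
from typing import Optional


def identifier_url(base_url: Optional[str], value: str) -> Optional[str]:
    """Prepend the ID type's base URL to an identifier value, if there is one."""
    if not base_url:
        return None  # base URL is optional; the value then stays plain text
    return base_url + value


EPA_BASE = "https://cfpub.epa.gov/surf/huc.cfm?huc_code="
print(identifier_url(EPA_BASE, "13020203"))
print(identifier_url(None, "13020203"))
```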

WKT

Do you have an account with TACC? If not I believe @ccicero can set you up. You can just SCP (or I think there are other options) the files to Corral, hosting them shouldn't be a problem.

special topics discussion

More eyeballs are always very welcome, but I think this is enough to just proceed as well - it only involves a definition rewrite and perhaps an additional geography source (I'm still not completely clear on that). We'll probably need to convene the AWG group before adding a dedicated field to geography, but once this has some data there should be few surprises in that move.

dustymc commented 6 years ago

From geography meeting

Watershed=hierarchical?!? Do we need to deal with it similar to geology?

Geology is on Locality - should watershed be as well?

amsnyder210 commented 6 years ago

Yes.


--


Alexandra M Snyder Collections Manager-Fishes Museum of Southwestern Biology MSC01-2020 University of New Mexico Albuquerque NM 87131 USA

PH.505.277.6005 OFFICE

Physical address for FedEx and UPS CERIA 83 Room 204 302 Yale Blvd NE

http://msb.unm.edu/

amsnyder210 commented 6 years ago
  1. USA and Caribbean Hydrologic Unit Codes (HUC) for inclusion in Arctos: would strongly suggest that the group decide on the level of classification, 1st to 8th. Right now, MSB fishes assigns the 8-digit or 4th level cataloging unit for locality records. This may be too refined for application in Arctos. I have attached a one page overview for consideration of different levels. Perhaps "accounting units" or one step above MSB fishes HUCs, 3rd level.
  2. DRAINAGE: Would strongly suggest that a field called "Drainage" be included in Arctos Locality drop down list, along with "Islands" or "Sea" or so on. Drainage should be considered a primary field in freshwater ichthyofaunal records. (But also aquatic invertebrates like mussels, amphibians, etc.)

Ichthyologists, aquatic ecologists, agency biologists, etc. will (typically) search on either of these two fields (HUC or Drainage) to access information on species composition (trends and presence/absence), habitat, etc. This is what these researchers expect to find when opening a query or when I receive requests for data. The Arctos form should have a visible field for researchers on which to build a query rather than having them guess which general text field would produce maximum records.

Watersheds, HUC, Assessments.pdf

amsnyder210 commented 6 years ago

Let's prioritize this for resolution on Thursday's AWG meeting.

campmlc commented 6 years ago

On Wed, Apr 11, 2018 at 11:14 AM, Mariel Campbell campbell@carachupa.org wrote: I don't know if the HUC model is created as hierarchical, but drainages certainly are, for example, the Ohio River Drainage is part of the Mississippi River Drainage; the San Juan is a tributary of the Colorado, and each of these of course is fed by countless smaller rivers and creeks.

I think we should at least proceed with the following: " precariously hang a "watershed" and "drainage" column off the side of something, and wait for user feedback."

I'm guessing on the "something": could be Collecting Event, since that includes verbatim information, and we can assign multiple collecting events -update if the drainage changes due to re-analysis, or massive flood or earthquake?

Dusty: If that works, perhaps we could do it slightly less precariously and address #1270, ecological parameters, at the same time. That model is:

collecting_event_id=1; term=temperature; value=-50; units=F
collecting_event_id=1; term=watershed; value=type something here or maybe select from a code table; units=NULL
collecting_event_id=1; ...

Along with the stuff you mentioned, that would deal with "The list is not current for Alaska." from https://water.usgs.gov/GIS/huc_name.html - the collecting event dates would be implicitly included.

And "select from a code table" could be hierarchical, much like lithography data - a search for "Mississippi River Drainage" could find things recorded as "Ohio River Drainage." Implementing that would not change how "collecting event attributes" work.
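A rough sketch of what a hierarchical code table would buy at search time. The table contents are hypothetical (the parent/child pairs come from the examples in this thread), and nothing here reflects an existing Arctos structure:

```python
# Hypothetical hierarchical code table: term -> parent term (None = top level).
# The relationships come from the examples in this thread.
DRAINAGE_PARENT = {
    "Mississippi River Drainage": None,
    "Ohio River Drainage": "Mississippi River Drainage",
    "Colorado River Drainage": None,
    "San Juan River Drainage": "Colorado River Drainage",
}


def expand_term(term: str) -> set[str]:
    """Return the term plus every term recorded beneath it."""
    found = {term}
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, parent in DRAINAGE_PARENT.items():
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


# Hypothetical collecting event attributes.
attributes = [
    {"collecting_event_id": 1, "term": "watershed", "value": "Ohio River Drainage"},
    {"collecting_event_id": 2, "term": "watershed", "value": "San Juan River Drainage"},
]

wanted = expand_term("Mississippi River Drainage")
print([a["collecting_event_id"] for a in attributes if a["value"] in wanted])
```

The attribute rows themselves stay flat; only the search expands the term, which is why this would not change how the attributes are stored.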

Thoughts?

dustymc commented 6 years ago

moved to https://github.com/ArctosDB/arctos/issues/1270

Jegelewicz commented 6 years ago

This sounds great to me! In the UTEP ant data there are the following which seem to fit here too:

  • Relative humidity
  • % overcast
  • Wind speed
  • Wind direction
  • Air temperature
  • Ground temperature
  • Soil texture
  • Soil color
  • Soil moisture
  • Soil organic content

These all seem to me to be an extension of "habitat".....

dustymc commented 6 years ago

habitat

Interesting point. That's currently in specimen_event because we didn't have a better place for things that don't necessarily apply to an entire event but also aren't quite attributes of specimens (eg, when the event is "where I walked today" and habitat is "under a log") . Moving that to the new structure would be about the same assertion, but also allows (but doesn't demand) multiple "scales" of habitat (something that's been a minor issue for a long time):

Everything else you mentioned seems straightforward and should fit no problem.

amsnyder210 commented 6 years ago

Some of us see drainage as more of a geographical constraint not part of a "collecting event" (like place, time, date, gear, etc). Water pH, temperature, DO, etc. are water/habitat information that may be better placed in Arctos Locality scheme rather than mixed in with specimen specific information like "tarsus length" or "stomach contents."

As I have not made a convincing argument for inclusion of drainage and watershed values as geographical attributes guess I will go along with what is deemed easiest for all involved.

Thank you for your efforts. Lex

On Tue, Apr 17, 2018 at 12:20 PM, dustymc notifications@github.com wrote:

It seems like dealing with this as collecting_event_attributes could work, so here's an expansion of #1273 (comment) https://github.com/ArctosDB/arctos/issues/1273#issuecomment-380594090 for hopefully-final approval.

New table collecting_event_attributes

collecting_event_attribute_id number: PKEY
collecting_event_id number not null: FKEY-->collecting_event.collecting_event_id number not null
determined_by_agent_id number not null: FKEY-->agent.agent_id number not null (or should this be NULL?)
event_attribute_type varchar2(255) NOT NULL: FKEY-->new table collecting_event_attribute_type
event_attribute_value varchar2(4000) NOT NULL: conditionally controlled by triggers
event_attribute_units varchar2(30) NULL: conditionally controlled by triggers
event_attribute_remark varchar2(4000) NULL
event_determination_method varchar2(4000) NULL
event_determined_date ISO8601 NULL

New table analogous to http://arctos.database.museum/info/ctDocumentation.cfm?table=CTATTRIBUTE_TYPE (controls event_attribute_type)

Possible values:

  • drainage
  • watershed
  • temperature
  • pH

(These can be added via authorities; adding them requires no code changes.)

New table analogous to http://arctos.database.museum/info/ctDocumentation.cfm?table=CTATTRIBUTE_CODE_TABLES

  • value in value_code_table-->event_attribute_value is code-table controlled
  • value in unit_code_table-->event_attribute_value is numeric, event_attribute_units is code-table controlled
  • no entry-->event_attribute_units must be NULL, event_attribute_value=free-text

Examples:

| term | units | value | result |
| --- | --- | --- | --- |
| drainage | NULL | ctdrainage | values are restricted to a (new) authority file |
| temperature | CTTEMPERATURE_UNITS | NULL | values are numeric, units are restricted to http://arctos.database.museum/info/ctDocumentation.cfm?table=CTTEMPERATURE_UNITS |

From those data + "Possible values" above, watershed and pH (not included in the table) would be free-text and unitless. (None of this is a recommendation for data values or controls, it's just an attempt at describing functionality.)
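The three control rules can be sketched as a small validation function. Everything named here is a hypothetical stand-in: the proposal puts this logic in database triggers, the code table contents are invented, and requiring NULL units for value-controlled types is an assumption drawn from the drainage example.

```python
from typing import Optional

# Hypothetical stand-ins for the proposed control table and code tables.
CONTROL = {
    "drainage": {"value_code_table": "ctdrainage", "unit_code_table": None},
    "temperature": {"value_code_table": None, "unit_code_table": "cttemperature_units"},
}
CODE_TABLES = {
    "ctdrainage": {"Rio Grande", "Pecos River"},
    "cttemperature_units": {"celsius", "kelvin"},
}


def check_attribute(attr_type: str, value: str, units: Optional[str]) -> list[str]:
    """Return a list of problems; an empty list means the row is acceptable."""
    # Types with no control entry are free-text and unitless.
    rule = CONTROL.get(attr_type, {"value_code_table": None, "unit_code_table": None})
    problems = []
    if rule["value_code_table"]:
        if value not in CODE_TABLES[rule["value_code_table"]]:
            problems.append("value not in code table")
    if rule["unit_code_table"]:
        try:
            float(value)
        except ValueError:
            problems.append("value must be numeric")
        if units not in CODE_TABLES[rule["unit_code_table"]]:
            problems.append("units not in code table")
    elif units is not None:
        problems.append("units must be NULL")
    return problems


print(check_attribute("drainage", "Rio Grande", None))
print(check_attribute("temperature", "cold", "furlongs"))
print(check_attribute("pH", "7.2", None))
```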

Example (partial) data (given the above). This could all be attached to one collecting_event (which could in turn be attached to any number of specimens, media, etc.)

| type | value | units | date | determiner | method | explanation |
| --- | --- | --- | --- | --- | --- | --- |
| drainage | something | NULL | today | me | NULL | current determination |
| drainage | somethingelse | NULL | yesterday | me | NULL | previous determination |
| drainage | somethingelse | NULL | yesterday | you | NULL | consensus determination |
| temperature | 5 | kelvin | last week | me | felt REALLY cold | determination, "warts and all" (eg, there's no particular reason to reject unlikely data) |
| temperature | 5 | celsius | last week | you | likely interpretation of unlikely Kelvin data | Curatorial "corrections" fit and can be documented in the same way as everything else |
| temperature | 5.600000001 | celsius | {YYYY-MM-DDTHH:MM:SSZ} | NOAA weather station | retrieved via API | These data don't necessarily have to come from any particular source (eg, collectors) and don't have to perfectly coincide with the event (it may span a day, temperature readings may come with second-precision) |

#1294 (https://github.com/ArctosDB/arctos/issues/1294) is probably a soft blocker (I don't think it needs to STOP this, but it should share priority): these data could require hundreds of "columns" and so may need an asynchronous/normalized entry option.

Note that the structure of the data does not control the appearance of the UI - we could allow users to search "drainage" from the Locality pane and "temperature" from {wherever} and restrict pH searches to the Curatorial pane or WHATEVER.

Things in here SHOULD be related to place-at-time (collecting events), but there's some flexibility in that as well. And nothing here changes the flexibility inherent to Places (which can be defined as "THAT cubic millimeter" or "somewhere near the planet, probably") or Events (Locality plus somewhere between "THAT second" and "sometime in the last 4 billion years, probably"). I think that makes sense for these data - the "The list is not current for Alaska." comment at https://water.usgs.gov/GIS/huc_name.html suggests these data are best represented as "{drainage} (as defined by SOURCE on DATE)" rather than "{drainage} (an attribute of a place which mostly doesn't change over human timescales, even if we occasionally refine names and boundaries)" as geology data are modeled.

Does this still seem like a viable solution, is there anything I'm not yet understanding, etc?


dustymc commented 6 years ago

Thanks Lex,

I don't think it's a matter of easiest, but of getting the model correct. There are certainly things we can't support without additional resources, but I think everything that's been discussed so far regarding drainage (and "event attributes" if those turn out to not be the same problem after all) is immediately possible.

If Drainage is hierarchical (see comments above), then I'm struggling to see how it could be incorporated into geography. That of course doesn't mean it can't, I'm just not fully there yet.

It's also possible to record "search terms" with geography. They're not exactly hierarchical, but serve roughly the same purpose in relation to specimens. See for example the "SrchTerm" column at http://arctos.database.museum/geography.cfm?higher_geog=Asia,%20Iran. So a geography record might include search terms...

  • tiny creek
  • small river
  • big river

and searching for any of those things will get users to the specimens which use the geography record to which they're attached. For example, https://arctos.database.museum/SpecimenResults.cfm?any_geog=%D8%AC%D9%85%D9%87%D9%88%D8%B1%DB%8C%20%D8%A7%D8%B3%D9%84%D8%A7%D9%85%DB%8C%20%D8%A7%DB%8C%D8%B1%D8%A7%D9%86 should find a bunch of specimens from "Asia, Iran" because...

screen shot 2018-04-17 at 1 10 17 pm

... جمهوری اسلامی ایران is linked to the geography which is linked to those specimens.
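A toy version of that lookup, to show the chain from search term to geography record to specimens. The data structures, the extra "Persia" term, and the specimen names are all hypothetical; only the percent-encoded string is taken from the URL above.

```python
from urllib.parse import unquote

# The any_geog value from the URL above, still percent-encoded.
encoded = (
    "%D8%AC%D9%85%D9%87%D9%88%D8%B1%DB%8C%20"
    "%D8%A7%D8%B3%D9%84%D8%A7%D9%85%DB%8C%20"
    "%D8%A7%DB%8C%D8%B1%D8%A7%D9%86"
)
term = unquote(encoded)

# Hypothetical data: search terms attached to a geography record,
# and specimens attached to the same record.
search_terms = {"Asia, Iran": {term, "Persia"}}
specimens = {"Asia, Iran": ["specimen-1", "specimen-2"], "Asia, Iraq": ["specimen-3"]}


def find_specimens(query: str) -> list[str]:
    """Match the query against geography names and their search terms."""
    hits = []
    for geog, found in specimens.items():
        if query == geog or query in search_terms.get(geog, set()):
            hits.extend(found)
    return hits


print(find_specimens(term))
```

A drainage name attached as a search term would be found the same way, but nothing in this structure knows that one drainage sits inside another.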

If I've misunderstood something and drainage is just a term (similar to continent, state, county, etc.), or if geography search terms are sufficient to treat it as such, then it can fit into geography (and should be pretty simple to add as a new concept).

If Drainage is a Feature (also as discussed above) then this is an even simpler problem, and I can have that running in minutes.

amsnyder210 commented 6 years ago

Dusty, To respond to your last statement: drainage as a term (similar to continent, state, county) my response is yes. Drainage is an encompassing attribute like continent. A continent can include many countries, so for example, if I search on Africa (continent) cichlidae (the taxa), I will get records from all Arctos museums that have cichlid fishes from a variety of African (east and west waterbodies) countries. Drainage is assigned (by museum collection) to different rivers and streams or "tributaries." If I search on Mississippi River (which flows through 10 US states) for Catostomus spp., I will get records from all Arctos museums that have suckers from all US states that have tributaries draining into or out of the Mississippi River. It is a more complete set of records than if I search on a state or county...or can remember all states in the Mississippi River Basin.

If an Arctos fish collection gives Arctos a drainage designation for their records, as the MSB fishes did, then I suggest that it would be included (in a separate field) in "Higher" or "Any" in "geography." In other words, it could be up to Arctos to accept (or not) drainage designations from contributing fish collections to Arctos but there will be a field that researchers can search on that makes sense for aquatic animals like fishes. If the drainage designation does not encompass enough fish records effectively for the research question, then HUC (Watershed) Value can be searched on (if provided by the fish collection).

I have successfully employed both fields (drainage and HUC) in organizing MSB fish records and can be confident that the queries I run are giving me complete results. Again, I am asked to run queries for researchers with the US Forest Service, US Fish and Wildlife, US Bureau of Reclamation, the NM Environment Department, The Nature Conservancy, etc. These have to be complete data sets because land use policy, biological opinions, etc. can be derived directly or indirectly from the results of these queries.

My hope was that I could direct researchers to Arctos for these fish queries. I now assume that using the map query may be what researchers could use for now? Lex

On Tue, Apr 17, 2018 at 2:24 PM, dustymc notifications@github.com wrote:

Thanks Lex,

I don't think it's a matter of easiest, but of getting the model correct. There are certainly things we can't support without additional resources, but I think everything that's been discussed so far regarding drainage (and "event attributes" if those turn out to not be the same problem after all) is immediately possible.

If Drainage is hierarchical (see comments above), then I'm struggling to see how it could be incorporated into geography. That of course doesn't mean it can't, I'm just not fully there yet.

It's also possible to record "search terms" with geography. They're not exactly hierarchical, but serve roughly the same purpose in relation to specimens. See for example the "SrchTerm" column at http://arctos.database.museum/geography.cfm?higher_geog=Asia,%20Iran. So a geography record might include search terms...

  • tiny creek
  • small river
  • big river

and searching for any of those things will get users to the specimens which use the geography record to which they're attached. For example, https://arctos.database.museum/SpecimenResults.cfm? any_geog=%D8%AC%D9%85%D9%87%D9%88%D8%B1%DB%8C%20%D8%A7%D8% B3%D9%84%D8%A7%D9%85%DB%8C%20%D8%A7%DB%8C%D8%B1%D8%A7%D9%86 should find a bunch of specimens from "Asia, Iran" because...

[image: screen shot 2018-04-17 at 1 10 17 pm] https://user-images.githubusercontent.com/5720791/38893962-b7299162-4240-11e8-90d0-733b88baf7ec.png

... جمهوری اسلامی ایران is linked to the geography which is linked to those specimens.

If I've misunderstood something and drainage is just a term (similar to continent, state, county, etc.), or if geography search terms are sufficient to treat it as such, then it can fit into geography (and should be pretty simple to add as a new concept).

If Drainage is a Feature (also as discussed above) then this is an even simpler problem, and I can have that running in minutes.


--


Alexandra M Snyder Collections Manager-Fishes Museum of Southwestern Biology MSC01-2020 University of New Mexico Albuquerque NM 87131 USA

PH.505.277.6005 OFFICE

Physical address for FedEx and UPS CERIA 83 Room 204 302 Yale Blvd NE

http://msb.unm.edu/

dustymc commented 6 years ago

Thanks again.

Given a specimen from, say, "Cimarron River" (spans a few states, is a tributary of the Arkansas, which in turn is a tributary of the Mississippi) what information would you ideally record?

What about "Cimarron River, New Mexico"? (E.g., in the current Arctos Geography model that would define a corner of NM - is that how you visualize these data, or are "Cimarron" and "NM" more independent concepts, or ??)

amsnyder210 commented 6 years ago

For MSB Fishes, the New Mexico records from the Cimarron River and Dry Cimarron River (and their tributaries) are assigned to the Arkansas River Basin (Region 11) rather than to the Canadian River (Subregion 1108 in New Mexico), which is where I would put them if these rivers were not confined to the extreme northeast corner of NM. So, to pick up these NM records along with those from the other states (Oklahoma, Colorado) in queries of all museum records, I assigned them to the Arkansas River Basin.

I view rivers and states as independent concepts, if I understand the question. Again, aquatic vertebrates and invertebrates traverse states via waterways. I want to know where they occur in water systems not states, counties or such.

Thank you. Lex
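For readers less familiar with HUC values such as Region 11 and Subregion 1108, the codes are nested by prefix: a 2-digit region contains 4-digit subregions, and so on down to 12 digits. The Python sketch below shows how that nesting could support "find everything in this unit" queries. The function names and the sample records are hypothetical, and the 8-digit codes other than 13020203 (cited at the top of this thread) are illustrative values that have not been checked against USGS.

```python
# Standard HUC code lengths and the level each length names.
HUC_LEVELS = {2: "region", 4: "subregion", 6: "basin",
              8: "subbasin", 10: "watershed", 12: "subwatershed"}


def huc_level(code: str) -> str:
    """Name the level of a HUC code from its length."""
    if not code.isdigit() or len(code) not in HUC_LEVELS:
        raise ValueError(f"not a HUC code: {code!r}")
    return HUC_LEVELS[len(code)]


def huc_parents(code: str) -> list[str]:
    """All enclosing units of a code, broadest first."""
    huc_level(code)  # validates the code
    return [code[:n] for n in sorted(HUC_LEVELS) if n < len(code)]


def records_in_unit(records: list[dict], unit: str) -> list[dict]:
    """Records whose HUC falls inside the given unit (prefix match)."""
    return [r for r in records if r["huc"].startswith(unit)]


# Hypothetical specimen records; only 13020203 comes from the thread.
records = [
    {"id": "A", "huc": "11080001"},  # illustrative code under Subregion 1108
    {"id": "B", "huc": "11040001"},  # illustrative code elsewhere in Region 11
    {"id": "C", "huc": "13020203"},  # Rio Grande example from the first comment
]

print(huc_parents("13020203"))
print([r["id"] for r in records_in_unit(records, "11")])
print([r["id"] for r in records_in_unit(records, "1108")])
```

A query on Region 11 returns both the Canadian and Cimarron examples, while a query on 1108 narrows to the Canadian River alone, which mirrors the choice Lex describes between the broader basin and the subregion.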


dustymc commented 6 years ago

I was under the impression that you'd want to record "Cimarron River" and find those (and a BUNCH of other stuff) by querying "Mississippi." If that's not the case then we probably don't need hierarchies. If that is the case, we might be able to cheat eg, by adding "Mississippi" and "Arkansas" and anything else which includes "Cimarron" as search terms of "Cimarron." (And again I'm not suggesting data and have no real interest in actual values, I'm just trying to understand the structure necessary to support what you want to do with your data.)

My second question was in reference to the fact that our geography entries are (spatially) the intersection of their components.

  • US, Yellowstone NP refers to bits of three states (and I don't think we have any data like that, although I also don't see a reason we couldn't)
  • US, Idaho, Yellowstone refers to a little sliver of the park
  • US, Idaho, Some County, Yellowstone might refer to a couple km^2 of the park
  • US, Idaho, Some County, Some USGS Quad, Yellowstone might refer to m^2

I'm just trying to understand if there's some reason that Drainages should or should not be in that model. I don't think it's a problem, but please do let me know if you see an issue there.

That brings up another question: Do your data include things currently in Feature? (Parks and such, but some vaguely-defined areas have slipped in - http://arctos.database.museum/info/ctDocumentation.cfm?table=CTFEATURE). I suspect you do have those data? If so, using Feature for Drainage won't work because there's one Feature available per geography record.

I think there may have also been some concerns on how that would look in the interfaces as well, but I don't see a problem there - we can label and query Drainage separately without much problem, wherever it's ultimately stored.

Thank you!

amsnyder210 commented 6 years ago

Re: first statement: I am not sure how to handle structure, per se. If the MSB Fishes drainage data fields (uploaded to Arctos) are currently populated with 7 subregion HUC-12 watersheds/drainages (i.e., Rio Grande, Gila River, Pecos River, Canadian River, Tularosa Basin, Guzman Basin, Zuni River, and Arkansas River), querying a drainage at a higher level like Mississippi River (2-digit first-level region) coupled with "New Mexico" would give you all of the New Mexico fish records. That would not be a satisfactory result if you wanted a list of fish species (for example) from the Canadian River Basin (which covers New Mexico, Texas and Oklahoma), and not just the Canadian River mainstem fauna but all the streams and creeks feeding into the Canadian River that our collection holds.

Using the mapping capability is definitely the option for running river basin queries, even though the map does not show river systems (or rivers from source to mouth) and researchers running the queries may not know which states these basins traverse. I am not terribly concerned about that issue, but if there is some way to confidently get all fish records from river basins, that would be good. The bottom line, of course, is that all museum fish collections would ideally connect their records to a drainage, but if there are collections that currently have drainage designations, I would incorporate those for now, regardless of how specific (subregion) or general (region).

Your idea of National Parks as a model for drainages is closer to reality than any other terrestrial model, I think. The "intersection of components", i.e., US states (or any political entity) joined by a shared system, is closer to the reality of drainage.

Third: Do MSB Fish data include items currently available in "Features?" Yes. I have a field for MSB Fishes called "Locality_Designation" to identify park lands (state and federal), tribal lands, BLM, etc. for encumbering data (tribal) and for reporting purposes (federal park land). I have not been a proponent of using the Arctos "Feature" field to capture "drainage/watershed" designation. Rather, I would still argue for it to be in a field that is connected to geography.

Thank you for considering these responses. Lex
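The "intersection of components" model discussed in this exchange can be pictured with plain sets. In the toy Python sketch below, each geographic term covers a handful of made-up map cells, and a geography record covers only the cells shared by all of its terms. The cell names, the `COVERAGE` table, and the `footprint` function are invented for illustration; real geography would use actual shapes, and nothing here reflects how Arctos stores spatial data.

```python
# Toy coverage table: each term maps to the set of map cells it covers.
# All cell names are invented for the example.
COVERAGE = {
    "Yellowstone NP": {"wy1", "wy2", "wy3", "mt1", "id1"},
    "Idaho": {"id1", "id2", "id3"},
    "Canadian River Drainage": {"nm1", "tx1", "ok1"},
    "New Mexico": {"nm1", "nm2", "nm3"},
}


def footprint(*terms: str) -> set[str]:
    """Cells covered by a geography record: the intersection of its terms."""
    cells = None
    for term in terms:
        covered = COVERAGE[term]
        cells = set(covered) if cells is None else cells & covered
    return cells or set()


# The park alone spans several states; adding a state narrows it to a sliver.
print(sorted(footprint("Yellowstone NP")))
print(sorted(footprint("Idaho", "Yellowstone NP")))

# A drainage behaves the same way: alone it crosses states,
# combined with a state it is the part of the drainage inside that state.
print(sorted(footprint("Canadian River Drainage")))
print(sorted(footprint("New Mexico", "Canadian River Drainage")))
```

Under this reading, a drainage term is no different from a park: on its own it spans political units, and each added component only narrows the area.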


dustymc commented 6 years ago

Thanks once again. I think we've come full circle and are back to adding a new field "drainage" to geog_auth_rec, and perhaps using geo_search_terms (they're always optional) to simulate some hierarchy for "parent drainages."

It's always possible to query by string-components, but they sometimes aren't what you might expect (see eg, our map of France at http://handbook.arctosdb.org/documentation/higher-geography.html#country). That may not much matter for Drainage data. Given...

US, NM, Canadian River Drainage

  • search term: Mississippi Drainage

US, OK, Canadian River Drainage

  • search term: Mississippi Drainage

(etc.)

then

  • any_geog (which considers search_terms) = Mississippi Drainage finds specimens from all of OK and parts (I think??) of NM
  • any_geog = Mississippi Drainage + drainage = Canadian River Drainage finds specimens from parts of OK and NM
  • any_geog = Mississippi Drainage + drainage = Canadian River Drainage + state = NM finds specimens from parts of NM

Does that sound reasonable/should we proceed?

And FYI the relationship between geography and ownership is imperfect - specimens collected before a Park was established will be incorrectly included, specimens with point-coordinates in a Park and error extending out may or may not actually "belong to" the Park, etc. https://github.com/ArctosDB/arctos/issues/1429 should eventually document an alternative approach. I think most everyone still uses geography to approximate NPS holdings and such, just be aware of the limitations.

amsnyder210 commented 6 years ago

Dusty, I think "any_geog" will work at this point. Worth trying. Thank you. Lex
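The query combinations proposed in this exchange (a parent drainage reached through a search term, optionally narrowed by the drainage field and by state) can be sketched in a few lines of Python. The `Geog` class, the `search` function, and the sample rows are assumptions for illustration only; in particular the Arkansas River and Rio Grande rows are made up to show what each filter excludes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geog:
    """Hypothetical geography record with a drainage field and search terms."""
    state: str
    drainage: str
    search_terms: tuple[str, ...] = ()


GEOGS = [
    Geog("New Mexico", "Canadian River Drainage", ("Mississippi Drainage",)),
    Geog("Oklahoma", "Canadian River Drainage", ("Mississippi Drainage",)),
    Geog("Oklahoma", "Arkansas River Drainage", ("Mississippi Drainage",)),  # made-up row
    Geog("New Mexico", "Rio Grande Drainage"),  # made-up row with no parent term
]


def search(geogs, any_geog=None, drainage=None, state=None):
    """Return the records that satisfy every filter that was supplied."""
    def hit(g: Geog) -> bool:
        if any_geog is not None:
            # any_geog looks at the record's own fields and its search terms.
            haystack = (g.state, g.drainage, *g.search_terms)
            if not any(any_geog.casefold() in h.casefold() for h in haystack):
                return False
        if drainage is not None and g.drainage != drainage:
            return False
        if state is not None and g.state != state:
            return False
        return True

    return [g for g in geogs if hit(g)]


parent = search(GEOGS, any_geog="Mississippi Drainage")
canadian = search(GEOGS, any_geog="Mississippi Drainage",
                  drainage="Canadian River Drainage")
canadian_nm = search(GEOGS, any_geog="Mississippi Drainage",
                     drainage="Canadian River Drainage", state="New Mexico")

print(len(parent), len(canadian), len(canadian_nm))
```

The search term only simulates a hierarchy: the parent drainage is found because someone attached it to each child record, so a child record without the term (the Rio Grande row here) is simply not part of the parent's results.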


campmlc commented 6 years ago

Dusty, this would be searchable under a separate "Drainage" field, correct?


dustymc commented 6 years ago

@campmlc this would work exactly like "island" or "continent" or anything else from geog_auth_rec.

amsnyder210 commented 6 years ago

Thanks again, Dusty. Lex


dustymc commented 6 years ago

I set this to next task. If anyone has objections or better ideas (to adding "drainage" to geog_auth_rec), please share them ASAP - I hope to start writing code next week.

dustymc commented 6 years ago

This is mostly implemented and can be tested at arctos-test.tacc.utexas.edu.

Specimen Search:

[image: screen shot 2018-04-25 at 8 49 05 am] https://user-images.githubusercontent.com/5720791/39257121-991fb2d4-4865-11e8-86a1-a25e4a3a34be.png

results

[image: screen shot 2018-04-25 at 8 50 17 am] https://user-images.githubusercontent.com/5720791/39257173-b56b778e-4865-11e8-847a-460b3300b24a.png

specimen detail

[image: screen shot 2018-04-25 at 8 50 53 am] https://user-images.githubusercontent.com/5720791/39257201-ce58320a-4865-11e8-956a-2dcef925a464.png

http://arctos-test.tacc.utexas.edu/guid/MVZ:Mamm:202690

edit geography

[image: screen shot 2018-04-25 at 8 53 32 am] https://user-images.githubusercontent.com/5720791/39257359-2e315c4c-4866-11e8-8871-cc6d4eff8d6c.png

Feedback appreciated, but I don't think there's anything very surprising. (Except perhaps the data themselves, which should be expected in the test environment.)

I extracted unique higher_geog + locality remarks through the string "Drainage" from production - data attached. These could be created as new geography once this structure goes to production.

temp_msb_f_d.csv.zip https://github.com/ArctosDB/arctos/files/1947878/temp_msb_f_d.csv.zip

Distinct drainages from that:


DRAINAGE
------------------------------
Gulf of Mexico Drainage
Arkansas River Drainage
Tularosa Basin Drainage
Guzman Basin Drainage
Canadian River Drainage
San Juan River Drainage
Pecos River Drainage
Colorado River Drainage
Fish Hatchery Drainage
Gila River Drainage
Pacific Ocean Slope Drainage
Great Basin Drainage
Arctic Ocean Drainage
Rio Grande Drainage
Mississippi River Drainage
Great Lakes Drainage
Zuni River Drainage
Atlantic Ocean Slope Drainage
Missouri River Drainage

19 rows selected.

A few preliminary questions:

  • Is "Fish Hatchery Drainage" a place, or secret code for "you may not want to use this specimen for some things"?
  • Does everything else look like what's desired in drainage?
  • Should we drop the word "drainage" from the data? It's redundant once these data are in a field labeled drainage, but sometimes it's worth keeping that stuff around anyway.

I'm basically looking for rules and guidelines.

Rules are things which can be built into triggers (https://github.com/ArctosDB/DDL/blob/master/triggers/uam_triggers/geog_auth_rec.sql). These cannot be bypassed by anyone for any reason. We could decide that the word "Drainage" must [or must not] exist in the field Drainage, and Arctos could require [or reject] any data which do [or do not] contain that string, for example.

Guidelines are things which are outlined at http://handbook.arctosdb.org/documentation/higher-geography.html#guidelines-for-geographic-terms-in-arctos. These are sometimes actionable as warnings - eg, the "Tijuana River Basin (drainage) does not occur in Source" thing in the edit geography screenshot above comes from...

[image: screen shot 2018-04-25 at 9 13 04 am] https://user-images.githubusercontent.com/5720791/39258478-e6323256-4868-11e8-944c-369ca7163bc3.png

... - but they are not and cannot be "hard" rules. (In this example the warning is just my weird vocabulary not quite lining up with wikipedia's weird vocabulary, not an indication that I've actually linked to an inappropriate "authority.")

Lacking surprises, I should be able to get this to production early next week (or MAYBE tomorrow). We can deal with the data as part of the release, or later, or a bit of both, or whatever.

amsnyder210 commented 6 years ago

Thank you, Dusty.

Responses to your three questions:

  1. "Fish Hatchery" is indeed "code" for "not wild caught", meaning that researchers, for various reasons (morphological/genetic anomalies), will not want to consider those collections.
  2. Yes, for now this is a complete list for the records of MSB fishes that relate to the region. Other providers should be able to add drainage at the discretion of Arctos guidelines.
  3. Using "Drainage" in a query against a field labeled Drainage is redundant. If you choose to include the word, that is fine as long as researchers do not have to include it in their query to get a list of records.

I will check out the query for Drainage and contact you again. Lex
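The distinction drawn in this exchange between rules (enforced, cannot be bypassed) and guidelines (surfaced as warnings), together with Lex's third point about not forcing users to type the word "Drainage", can be sketched as follows. This is Python rather than the database triggers Arctos actually uses, and the function names and the `word_required` switch are hypothetical; the thread had not settled which variant of the rule to adopt at this point.

```python
def enforce_rule(drainage: str, word_required: bool) -> str:
    """Hard rule (trigger-like): raise if the value breaks the chosen policy."""
    value = drainage.strip()
    has_word = value.casefold().endswith("drainage")
    if word_required and not has_word:
        raise ValueError(f"{value!r} must end with 'Drainage'")
    if not word_required and has_word:
        raise ValueError(f"{value!r} must not end with 'Drainage'")
    return value


def guideline_warnings(drainage: str, source_text: str) -> list[str]:
    """Soft guideline: report a mismatch, but do not block the save."""
    if drainage.casefold() in source_text.casefold():
        return []
    return [f"{drainage} (drainage) does not occur in Source"]


def strip_word(term: str) -> str:
    """Drop a trailing 'Drainage' so stored values and queries compare equal."""
    words = term.split()
    if words and words[-1].casefold() == "drainage":
        words = words[:-1]
    return " ".join(words)


def same_drainage(stored: str, query: str) -> bool:
    """Match regardless of whether either side includes the redundant word."""
    return strip_word(stored).casefold() == strip_word(query).casefold()


print(enforce_rule("Gila River Drainage", word_required=True))
print(guideline_warnings("Tijuana River Basin", "The Tijuana River watershed"))
print(same_drainage("Gila River Drainage", "gila river"))
```

Whichever way the rule goes, comparing values with the trailing word stripped keeps the query side tolerant, which is the behavior Lex asks for in point 3.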


dustymc commented 6 years ago

This is now in production and I made new geography from the data posted above - http://arctos.database.museum/geography.cfm?drainage=_

Please review and let me know if I should update specimens.

amsnyder210 commented 6 years ago

Dusty,

A quick check through higher geography drainages generated via MSB fishes database finds a few errors that came from some stray records. There are just a few, so I will edit these MSB specimen records in Arctos.

  1. Pecos River: delete Bernalillo County designation.
  2. San Juan River: delete Bernalillo County designation.
  3. San Juan River: Colorado State=Montezuma County (not San Juan County)
  4. San Juan River: delete San Miguel County designation.
  5. Tularosa Basin: delete Catron County designation.

Thank you. Lex


dustymc commented 6 years ago

Lex: I did NOT update any specimens, I just created new geography based on existing + data from locality remarks. Please let me know if any of the above don't exist (eg, there is no Pecos River, Bernalillo County) and I'll delete the authority data. If eg Pecos River, Bernalillo County DOES exist but there are just no specimens, we can keep it and it'll be available when someone needs it.

I pulled your data with drainage out and applied these corrections:


 update temp_msb_f_f_u set new_higher_geog='North America, United States, New Mexico, Pecos River' where 
    new_higher_geog='North America, United States, New Mexico, Bernalillo County, Pecos River';

 update temp_msb_f_f_u set new_higher_geog='North America, United States, New Mexico, San Juan River' where 
    new_higher_geog='North America, United States, New Mexico, Bernalillo County, San Juan River';

 update temp_msb_f_f_u set new_higher_geog='North America, United States, Colorado, Montezuma County, San Juan River' where 
    new_higher_geog='North America, United States, Colorado, San Juan County, San Juan River';

 update temp_msb_f_f_u set new_higher_geog='North America, United States, New Mexico, San Juan River' where 
    new_higher_geog='North America, United States, New Mexico, San Miguel County, San Juan River';

 update temp_msb_f_f_u set new_higher_geog='North America, United States, New Mexico, Tularosa Basin' where 
    new_higher_geog='North America, United States, New Mexico, Catron County, Tularosa Basin';
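One way to rehearse corrections like these before touching real data is to run them against a throwaway table and count the rows each statement changes. The Python sketch below does that with an in-memory SQLite database standing in for the real one. The table layout (a single column), the sample rows, and the `apply_corrections` helper are assumptions for illustration; only the table name, column name, and the old and new values are taken from the statements above.

```python
import sqlite3

PREFIX = "North America, United States, "

# (wrong value, corrected value) pairs, copied from the update statements above.
CORRECTIONS = [
    ("New Mexico, Bernalillo County, Pecos River", "New Mexico, Pecos River"),
    ("New Mexico, Bernalillo County, San Juan River", "New Mexico, San Juan River"),
    ("Colorado, San Juan County, San Juan River",
     "Colorado, Montezuma County, San Juan River"),
    ("New Mexico, San Miguel County, San Juan River", "New Mexico, San Juan River"),
    ("New Mexico, Catron County, Tularosa Basin", "New Mexico, Tularosa Basin"),
]


def apply_corrections(conn: sqlite3.Connection) -> dict[str, int]:
    """Run each update and report how many rows it changed."""
    counts = {}
    for old, new in CORRECTIONS:
        cur = conn.execute(
            "UPDATE temp_msb_f_f_u SET new_higher_geog = ? "
            "WHERE new_higher_geog = ?",
            (PREFIX + new, PREFIX + old),
        )
        counts[old] = cur.rowcount
    return counts


# Stand-in table with two made-up rows: one that needs fixing, one that does not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_msb_f_f_u (new_higher_geog TEXT)")
conn.executemany(
    "INSERT INTO temp_msb_f_f_u VALUES (?)",
    [
        (PREFIX + "New Mexico, Bernalillo County, Pecos River",),
        (PREFIX + "New Mexico, Rio Grande",),
    ],
)

counts = apply_corrections(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM temp_msb_f_f_u WHERE new_higher_geog LIKE ?",
    ("%Bernalillo County, Pecos River",),
).fetchone()[0]

print(counts)
print(remaining)
```

A zero count for a correction is itself useful information: it means no row carried that erroneous value, so the statement was a no-op.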

Data attached. Please review, and if everything looks as it should I can replace HIGHER_GEOG (current data, does not contain drainage) with NEW_HIGHER_GEOG (does contain drainage).

temp_msb_f_f_u(1).csv.zip https://github.com/ArctosDB/arctos/files/1956222/temp_msb_f_f_u.1.csv.zip

amsnyder210 commented 6 years ago

Thank you, Dusty. The 5 corrections I sent you are true statements regardless of whether any specimen records are associated with them. I.e., Bernalillo Co. will never, ever incorporate the Pecos River and its drainages...ever. (Bernalillo County=Rio Grande Drainage.) These 5 errors were generated from erroneous data in the MSB fishes database.

Lex


dustymc commented 6 years ago

Got it, they're gone.

dustymc commented 6 years ago

Closing - please open a new issue for any remaining concerns