Open rwjam opened 3 years ago
@rwjam Not sure if this is the best place to post about this issue, so feel free to move it where you see fit. Why does WestDAAT not have an owner classification value of "United States of America", when we have it in the CVs list in the Google Sheet? https://westdaat.westernstateswater.org/
https://docs.google.com/spreadsheets/d/1tZ3DIYDx7J-dsldHfihQjOABibMpFoGKvmSe8ReKW10/edit?usp=sharing
I was looking at Idaho as an example, where they said they do have federal or tribal owners. The general federal owner is usually the United States of America.
For the tribal rights in Idaho, it appears that WestDAAT only shows 24 rights (see the pie chart). Based on what Jerry Rigby said, there should be more tribal rights. Could you look more into the classification in Idaho? https://westdaat.westernstateswater.org/?state=N4Ig7gTiBcoPZgHYFMIGEA2BDAzjglgGb4DGWALvnIjjANogByF%2BAbsgAQCCAtqqVkQgAugBoQOchWS1oDAJIAREeIAOcDAE8M%2BFPTEhEGACb4AYvgzlUiilhihslcgFdjyGIhcYM4jNQBzfFd3T29fEFMIZBJKalkAZnFjOwAFOF1yWQB2AF9c8R4sVQcQAC84OB4AGWR2DBgATgA6AA4ANgTG7PaAFgBWRtbWxt7WgCY-FhCPaF7x5oTs7I7e7IBGdv72xvX%2Bv0Dgt1mAWnXN5t726-bx1vXWq4SABlb8oA
> Why does WestDAAT not have an owner classification value of "United States of America", when we have it in the CVs list in the Google Sheet?

Ran a quick SQL script. We have 41,450 water rights with an OwnerClassificationCV value of “United States of America.”
> I was looking at Idaho as an example, where they said they do have federal or tribal owners. The general federal owner is usually the United States of America. For the tribal rights in Idaho, it appears that WestDAAT only shows 24 rights (see the pie chart). Based on what Jerry Rigby said, there should be more tribal rights. Could you look more into the classification in Idaho?

Currently we should be showing 69 water rights in Idaho with an OwnerClassificationCV value of “Native American” and another 29,474 water rights with an OwnerClassificationCV value of “United States of America.” It’s possible we are missing some, but that may come down to values being truncated, as we only support the first value given in an owner list.
(see comments below) We use a string search based on lists of similar words. For Native American, we look for the words ["tribe", "tribes", "nation", "nations", "indians"], and for United States of America we look for ["united states of america", "united states america", "usa"].
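To make the string search concrete, here is a minimal sketch in Python. The two keyword lists are the ones quoted above; the function name `classify_owner`, the dictionary layout, the whole-word matching, and the "In Review" fallback are assumptions for illustration, not the actual custom function stored in the repository.

```python
import re

# Keyword lists quoted from the comment above. The dictionary layout and
# everything below it are illustrative assumptions.
KEYWORDS = {
    "Native American": ["tribe", "tribes", "nation", "nations", "indians"],
    "United States of America": [
        "united states of america",
        "united states america",
        "usa",
    ],
}


def classify_owner(owner: str) -> str:
    """Return the first classification whose keyword appears in the owner name."""
    # Lowercase and reduce the name to space-separated alphanumeric tokens.
    text = " ".join(re.findall(r"[a-z0-9]+", owner.lower()))
    padded = f" {text} "
    for label, keywords in KEYWORDS.items():
        for keyword in keywords:
            # Whole-word match (an assumption), so "nation" does not
            # fire on "National".
            if f" {keyword} " in padded:
                return label
    # Hypothetical fallback for names that match nothing.
    return "In Review"


print(classify_owner("Nez Perce Tribe"))
print(classify_owner("UNITED STATES OF AMERICA, BLM"))
print(classify_owner("First National Bank"))
```

Matching on whole words is a design choice in this sketch: a plain substring search would classify "First National Bank" as Native American because it contains "nation".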
As a heads up, the links you are sharing are for the production database / prod WestDAAT, which has not been updated and does not do a good job of reflecting the work we have done in the last several months. The UAT version(s) are showing slightly better results. This gets into the difficulties we've had with copying data from UAT to prod, which it sounds like DPL might try to improve for us (based on the 01/05/2023 meeting notes).
A quick comment on how we are currently approaching the OwnerClassificationCV value.
We use a custom function to assign OwnerClassificationCV based on a provided owner value. We store that function and its documentation on GitHub here: https://github.com/WSWCWaterDataExchange/MappingStatesDataToWaDE2.0/tree/master/5_CustomFunctions/OwnerClassification
Owner type is technically a one-to-many relationship: a single water right can have multiple owners. However, we don’t support that relationship in WaDE. As of right now we assign OwnerClassificationCV based on an ordered list of values, with the exception that we treat anything we consider to be “In Review” as the least important (see code image). Then it’s just a simple match on the first word in the provided list.
This gets into issues similar to how we treat PrimaryBeneficialUseCategory. We use that field for categorization and labeling in the WestDAAT legend, as WestDAAT does not support coloring a single water right site with multiple beneficial uses. We would run into the same issue here with OwnerClassificationCV. We could take the same approach with OwnerClassificationCV as we do with PrimaryBeneficialUseCategory and use a hierarchy to determine which OwnerClassificationCV we consider more important; we just need to be ready to make the argument for how we made our selections.
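As a rough sketch of what a hierarchy approach could look like for a water right with several owners: the labels come from this thread, but the priority order, the `PRIORITY` list, and the `pick_classification` helper are hypothetical, and the real ordering would need to be agreed on and justified.

```python
from typing import Sequence

# Hypothetical hierarchy, most important first. The order shown here is an
# assumption for illustration only.
PRIORITY = ["Native American", "United States of America"]


def pick_classification(labels: Sequence[str]) -> str:
    """Pick one classification for a water right from its owners' labels."""
    rank = {label: position for position, label in enumerate(PRIORITY)}
    # Ignore labels outside the hierarchy, including "In Review".
    known = [label for label in labels if label in rank]
    if not known:
        # "In Review" is treated as least important, so it only wins
        # when nothing else is available.
        return "In Review"
    return min(known, key=lambda label: rank[label])


owners = ["In Review", "United States of America", "Native American"]
print(pick_classification(owners))
```

With a first-value-only rule, the example list would be reported as "In Review"; with the hierarchy, the highest-ranked label among all owners is used instead.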
In addition to listing the owner of a water right, WSWC would also like to categorize the owners into groups. However, states do not track this information so it is up to us to create this.
Initial large-scale categories thus far include: Privately Owned, Commercial, Military, Government, & Natural Resources. Each large-scale category is broken up further into smaller, more defined groups (e.g., Bureau of Land Management, etc). The initial design uses a keyword search that starts with a generic search, whose result is then replaced by a more specific search where one matches (see for loop approach in code).
Work most likely to be assigned to a WSWC intern. Intern responsibilities would be as follows...
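The generic-then-specific search could be sketched as a loop over rules ordered from broad to narrow, where a later match overwrites an earlier one. Everything in this block is an assumption for illustration: the `RULES` list, its keywords, the placement of Bureau of Land Management under Government, and the "In Review" default are not taken from the actual script.

```python
# Rules ordered from generic to specific. All keywords are hypothetical.
RULES = [
    ("Government", ["bureau", "department", "state of"]),
    ("Bureau of Land Management", ["bureau of land management"]),
]


def classify(owner: str, default: str = "In Review") -> str:
    """Apply every rule in order; the last matching rule wins."""
    text = owner.lower()
    result = default
    for label, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            # A more specific rule later in the list replaces the
            # generic result found earlier.
            result = label
    return result


print(classify("US Bureau of Land Management"))
print(classify("Bureau of Reclamation"))
print(classify("Jane Doe"))
```

Because the loop never stops at the first hit, adding a new specific group only requires appending a rule after its generic parent.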
A clean input would help reduce the number of errors and exceptions that need to be handled. Each state has a different method for listing owners (e.g., single owner, multiple owners and owner types, last name included but no first name, etc), so cutting down the number of exceptions per state will improve accuracy. As long as the cleaned input is a duplicate of the original owner-provided information, that should keep WaDE's promise not to override state-given data.
OwnerClassificationCV Table located here: https://docs.google.com/spreadsheets/d/1tZ3DIYDx7J-dsldHfihQjOABibMpFoGKvmSe8ReKW10/edit?ts=5de93fa0#gid=1004421132
Initial work, script, and notes located here: https://github.com/WSWCWaterDataExchange/MappingStatesDataToWaDE2.0/tree/master/OwnerClassification