DOI-USGS / gems-tools-arcmap

Tools for working with the GeMS geologic map database schema in ArcGIS
Creative Commons Zero v1.0 Universal

DataSources Null values not permitted. #58

Open LindaTedrow opened 3 years ago

LindaTedrow commented 3 years ago

In many of our feature classes we have both a DataSourceID and a DataSourceID2 field. According to the GeMS standard, this value cannot be null. That is not an issue for DataSourceID, which is never NULL. However, there are many instances where DataSourceID2 is NULL, and the validation code does not allow that. Is it possible to allow DataSourceID2 to be NULL?

ethoms-usgs commented 3 years ago

Hi Linda, I just added a DataSourceID2 field to a test gdb and it passed the validation tool. The field was simply recognized as a non-GeMS field. Do you have the latest version?

Are you adding that field because of the need to associate more than one DataSource to a feature?

LindaTedrow commented 3 years ago

Hello Evan, Thank you for checking this. No, I do not currently have the latest version and I have not checked this recently. This has been on my “to do” list for a while and I was getting ready to run the script again.

I will make sure we are using the latest before next validation.

Yes, we are using this field to associate more than one DataSource with a feature. The idea of having more than one DataSource seems odd, but when working with legacy databases, I think I must keep both.

Thank you,

Linda


ethoms-usgs commented 3 years ago

All right. I know that the documentation implies a one-to-one relationship and that is what the tool tests for, but the cardinality is not explicitly described and I think we should be allowing for a many-to-many relationship. Mostly I think this because 1) I don't think it's the place of the GeMS schema to say there can only be one source per feature and 2) the alternative, where people add fields that are not covered by the schema, is essentially an un-normalized, flattened-spreadsheet sort of design, which is poor database design. I am encouraging all GeMS groups to use the kind of relationship they think is best, and that should force us (me and Ralph) to rewrite the tools to accommodate them.

To my mind, it doesn't matter how many data sources get related to a feature. I just want there to be at least one, and I want related records to be discoverable reliably and programmatically. A relationship class inside the gdb can be queried, and even in the absence of one, primary key values can be searched for in a foreign table.
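
A minimal sketch of the key-search idea above, in plain Python rather than arcpy. The in-memory tables, the junction list `feature_sources`, the feature IDs, and both helper functions are hypothetical stand-ins for illustration, not part of the GeMS tools; `DataSources_ID` is assumed to be the primary key field of the DataSources table.

```python
# Hypothetical stand-in for the DataSources table (not read from a real gdb).
data_sources = [
    {"DataSources_ID": "DAS01", "Source": "This report"},
    {"DataSources_ID": "DAS05", "Source": "Legacy map"},
]

# Hypothetical junction rows (feature ID, data source ID) that allow
# a many-to-many relationship between features and sources.
feature_sources = [
    ("CAF001", "DAS01"),
    ("CAF001", "DAS05"),
    ("CAF002", "DAS01"),
]


def sources_for_feature(feature_id, links, sources):
    """Return the DataSources rows related to one feature, in link order."""
    by_id = {row["DataSources_ID"]: row for row in sources}
    return [by_id[sid] for fid, sid in links if fid == feature_id and sid in by_id]


def features_without_source(feature_ids, links, sources):
    """Return feature IDs that have no link to any known data source."""
    known = {row["DataSources_ID"] for row in sources}
    linked = {fid for fid, sid in links if sid in known}
    return [fid for fid in feature_ids if fid not in linked]


if __name__ == "__main__":
    print(sources_for_feature("CAF001", feature_sources, data_sources))
    print(features_without_source(["CAF001", "CAF002", "CAF003"],
                                  feature_sources, data_sources))
```

The second helper expresses the "at least one" rule: any feature it returns has no discoverable source, however many sources the other features carry.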

If you go that route, I understand that DataSourceID still needs to be filled in and the value might not make much sense, but just do what you think is best there and we'll work with that moving forward.

LindaTedrow commented 3 years ago

Evan,

There is the old database normalizing rule: normalize till it hurts, then back up.

I do understand the requirement for DataSourceID to be filled.

It is only the second data source that is in question. Mainly, I do not want to put “NA” or “-9999” or something like that into all the second data source fields that do not have a second data source. Features that have a second data source are not the norm, but they do exist.

Thanks again. Linda


ethoms-usgs commented 3 years ago

Hi Linda, I am revisiting outstanding issues. Do we need to keep this open? I would say either use a concatenated key value, e.g., "DAS01 | DAS05" (data source IDs delimited by pipe characters), or use your second DataSource field. Since you have added that field as an extension, you can define its attributes how you like. You can allow nulls. It will never be checked by the Validate Database tool.
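
A minimal sketch of how a pipe-delimited key such as "DAS01 | DAS05" could be split and checked, assuming plain Python strings rather than values read with arcpy. The function names and the `known_ids` set are hypothetical; this is not how the Validate Database tool is implemented.

```python
def split_source_ids(value, delimiter="|"):
    """Split a concatenated key like 'DAS01 | DAS05' into individual IDs.

    A null (None) or empty value yields an empty list.
    """
    if value is None:
        return []
    return [part.strip() for part in value.split(delimiter) if part.strip()]


def unknown_source_ids(value, known_ids):
    """Return the IDs in a concatenated key that are missing from known_ids."""
    return [sid for sid in split_source_ids(value) if sid not in known_ids]


if __name__ == "__main__":
    # Hypothetical set of DataSources_ID values present in the database.
    known_ids = {"DAS01", "DAS05"}
    print(split_source_ids("DAS01 | DAS05"))
    print(unknown_source_ids("DAS01 | DAS09", known_ids))
```

Stripping whitespace around each part means "DAS01|DAS05" and "DAS01 | DAS05" are treated the same way.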