Closed sdc50 closed 5 years ago
`sites` or `locations` makes sense for geospatial data. However, it doesn't really have any relevance for non-geo data. Our current data model is structured such that datasets belong to a feature and features belong to a collection. I would like to keep the data model consistent for all datasets (geo and non-geo), but I don't have a good idea for a more generic name for `features` that would make sense in the context of non-geo data. Any suggestions?
features came from the OGC SOS2 concept of features, so in that sense it's a reasonable name. It is just awkward for folks who are not familiar with the standard.
So I looked up some words that I found interesting:
Let me know. Thanks
@AaronV77 I think the terms "property" and "attribute" are too generic and have a specific meaning in code that I think would cause confusion. I am leaning toward using the term `locations` but then changing the database structure so that datasets have a direct link to collections and the `locations` are only populated for geo-specific datasets.
@sdc50 I think `property` has a ton of meanings to it and not just code, but I think it fits better in our data model. I'm not really feeling the name 'locations' because it doesn't fit our non-geo data model very well. I would love to have a name that can represent both non-geo and geo data. Thanks
@AaronV77 I agree with you that `property` has a ton of meanings. Datasets have many properties, and that's why I don't think it makes sense to officially assign one property (i.e. its `feature`) the name `property`. Maybe I wasn't clear in my previous post, but I was suggesting that the term `location` be used to describe a specific property only for the geographic datasets. The non-geo datasets wouldn't have a `location` property.
@sdc50 my apologies, ya the way that I understood was that you wanted to try and change the name to 'locations'. I would agree with adding 'location' somewhere within the metadata, but I'm worried still about the word 'feature' having more than one meaning with the Quest code / documentation. Thanks
Let me try to clarify. I am proposing that we change the name of `features` to `locations`. I am also proposing that we change the data model so that not all datasets have a `location`. This would have a few implications:
- `get_features` would be called `get_locations` and would only search for geographic datasets.
- `get_tags` would be expanded and used to search for non-geo datasets.
- `collection` would be added as a new property directly to the datasets (right now datasets have to access their `collection` through their `feature`).
- The `feature` property of datasets would be called `location` and would be null for non-geo datasets.

We will need to rethink our searching tools. Right now we search data by downloading metadata from each service that gives us an idea of what data it might have by listing the `locations`/`features` and the `parameters`. The problem is that many datasets that we are interested in don't fit the `location`/`parameter` paradigm.
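To make the proposal concrete, here is a minimal sketch of the data shape it implies, using plain dataclasses rather than the real database entities. The `Dataset` class and the two helper functions below are hypothetical stand-ins that only borrow the names from this thread; they are not the Quest API and their signatures are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    """Hypothetical stand-in for a dataset record (not the real entity)."""
    name: str
    collection: str                  # direct link to the owning collection
    location: Optional[str] = None   # stays None for non-geo datasets
    tags: List[str] = field(default_factory=list)


def get_locations(datasets):
    """Return the distinct locations of geographic datasets only."""
    return sorted({d.location for d in datasets if d.location is not None})


def get_tags(datasets):
    """Return every tag in use, across geo and non-geo datasets alike."""
    return sorted({tag for d in datasets for tag in d.tags})


datasets = [
    Dataset("streamflow", collection="demo", location="gauge-01", tags=["hydrology"]),
    Dataset("report", collection="demo", tags=["pdf", "hydrology"]),
]

print(get_locations(datasets))  # only the geographic dataset contributes
print(get_tags(datasets))       # both datasets contribute
```

The point of the sketch is that a non-geo dataset still belongs to a collection and is still discoverable by tag, even though it has no location.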
Renaming `filters` to `tools`:
- `api/filters.py` -> `api/tools.py`
- `get_filters` -> `list_tools`
- `apply_filter` -> `run_tool`
- `get_filter_options` -> `get_tools_options`
- `quest_filter_plugins/` -> `quest_tool_plugins`
- `plugin_namespaces['filters']` -> `plugin_namespaces['tools']`, etc.
- `plugins/base/filter_base.py` -> `plugins/base/tool_base.py`
- `FilterBase` -> `ToolBase`
- `FilterBase.apply_filter` -> `ToolBase.run_tool`
- `FilterBase._apply_filter` -> `ToolBase._run_tool`
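If existing callers need to keep working while the rename lands, one option (not something proposed in this thread) is to leave the old names in place as thin aliases that warn. The sketch below assumes a placeholder `list_tools` body and a hypothetical `deprecated_alias` helper; neither is taken from the Quest code base.

```python
import functools
import warnings


def list_tools():
    """Placeholder for the renamed API call; the return value is made up."""
    return ["example-tool"]


def deprecated_alias(new_func, old_name):
    """Wrap new_func so that calling it under old_name emits a DeprecationWarning."""
    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use {new_func.__name__} instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this wrapper
        )
        return new_func(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


# The old name keeps working but tells callers where to go.
get_filters = deprecated_alias(list_tools, "get_filters")
```

The same helper could cover `apply_filter` -> `run_tool` and the other function renames in the list; module and class renames would need a different shim.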
Can we do 'get_tool_options' rather than 'get_tools_options'? Thanks
Also, can we call it 'get_tools' instead of renaming 'get_filters' -> 'list_tools'? All the other methods follow the 'get_' pattern, so it would not make sense to change it now.
Renaming `features` to `catalog`:
- `api/features.py` -> `api/catalog.py`
- `add_features` -> `add_datasets` or `add_datasets_for_catalog_entries`
- `get_features` -> `search_catalog`
- `new_feature` -> `new_catalog_entry`
- `ProviderBase.get_features` -> `ProviderBase.get_catalog`
- `ServiceBase.get_features` -> `ServiceBase.get_catalog`, etc.

This refactor will remove the idea of collection features. Datasets will now point directly to Collections and will reference the service/service_id where they came from.
Here are some proposed changes from the database file (for reference):

    class Collection(db.Entity):
        name = orm.PrimaryKey(str)
        display_name = orm.Optional(str)
        description = orm.Optional(str)
        # pass the callable so the timestamp is taken per record, not once at import
        created_at = orm.Required(datetime, default=datetime.now)
        updated_at = orm.Optional(datetime)
        metadata = orm.Optional(orm.Json)

        # setup relationships
        datasets = orm.Set('Dataset')

    # Feature table should be commented out. (possibly will be brought back as the Catalog table).

    class Dataset(db.Entity):
        name = orm.PrimaryKey(str)
        display_name = orm.Optional(str)
        description = orm.Optional(str, nullable=True)
        # pass the callable so the timestamp is taken per record, not once at import
        created_at = orm.Required(datetime, default=datetime.now)
        updated_at = orm.Optional(datetime)
        metadata = orm.Optional(orm.Json)

        # dataset required metadata
        parameter = orm.Optional(orm.Json)
        unit = orm.Optional(str)
        datatype = orm.Optional(str)
        file_format = orm.Optional(str)
        source = orm.Optional(str)
        options = orm.Optional(orm.Json)
        status = orm.Optional(str)
        message = orm.Optional(str)
        file_path = orm.Optional(str, nullable=True)
        visualization_path = orm.Optional(str)

        # setup relationships
        collection = orm.Required(Collection)
        # TODO for now this will just be the service uri with id but will be orm.Required(Catalog)
        catalog_entry = orm.Required(str)
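As a rough illustration of a dataset that points straight at its collection and carries a plain-string reference back to its source, here is a database-free sketch. The `"service/service_id"` string layout and the helpers `make_catalog_entry` and `split_catalog_entry` are assumptions made up for this example; the thread does not specify the actual URI format.

```python
from dataclasses import dataclass


def make_catalog_entry(service, service_id):
    """Join a service name and a service-side id into one reference string.

    The "service/service_id" layout is an assumption for illustration only.
    """
    return f"{service}/{service_id}"


def split_catalog_entry(entry):
    """Recover (service, service_id), splitting at the first slash.

    Assumes the service name itself contains no slash.
    """
    service, _, service_id = entry.partition("/")
    return service, service_id


@dataclass
class Dataset:
    """Hypothetical stand-in for the proposed dataset record."""
    name: str
    collection: str      # datasets point directly at their collection
    catalog_entry: str   # string reference to where the data came from


ds = Dataset(
    "d1",
    collection="demo",
    catalog_entry=make_catalog_entry("example-service", "site-42"),
)
```

Keeping `catalog_entry` as an opaque string for now means a later switch to a real Catalog table only has to change how the string is resolved, not every dataset record.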
@sdc50 you have get_features twice in the naming rework. Was there another method that you were thinking of, or am I missing something? Thanks
This is addressed and closed by PR #111
`Filters` and `Features` should be given more intuitive names, such as `tools` and `sites` or `locations`.