erdc / quest

Python API for downloading and managing data. Check out the documentation at:
https://quest.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Rename `Filters` and `Features` #89

Closed sdc50 closed 5 years ago

sdc50 commented 6 years ago

Filters and Features should be given more intuitive names, such as tools and sites or locations.

sdc50 commented 6 years ago

sites or locations makes sense for geospatial data. However, it doesn't really have any relevance for non-geo data. Our current data model is structured such that datasets belong to a feature and features belong to a collection. I would like to keep the data model consistent for all datasets (geo and non-geo), but I don't have a good idea for a more generic name for features that would make sense in the context of non-geo data. Any suggestions?
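
For illustration, the hierarchy described here (datasets belong to a feature, features belong to a collection) could be sketched roughly as below. This is a minimal sketch using plain dataclasses, not Quest's actual database code; the field names and sample values are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Dataset:
    name: str


@dataclass
class Feature:
    name: str
    # (longitude, latitude); None when the data has no geographic location
    geometry: Optional[Tuple[float, float]] = None
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    features: List[Feature] = field(default_factory=list)


# A geospatial dataset fits naturally: the feature is a site with a location.
gauge = Feature("river-gauge", geometry=(-90.87, 32.31),
                datasets=[Dataset("streamflow")])

# A non-geo dataset still needs a feature to hang from, even though that
# feature carries no location. This is the awkward case under discussion.
report = Feature("annual-report", datasets=[Dataset("report-2018")])

collection = Collection("demo", features=[gauge, report])
```

In this sketch the second feature exists only to satisfy the hierarchy, which is why a location-flavored name for it reads oddly for non-geo data.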

dharhas commented 6 years ago

features came from the OGC SOS2 concept of features, so in that sense it's a reasonable name. It is just awkward for folks who are not familiar with the standard.

AaronV77 commented 6 years ago

So I looked up some words that I found interesting:

Let me know. Thanks

sdc50 commented 6 years ago

@AaronV77 I think the terms "property" and "attribute" are too generic and have a specific meaning in code that I think would cause confusion. I think I am leaning toward using the term locations, but then changing the database structure so that datasets have a direct link to collections and the locations are only populated for geo-specific datasets.

AaronV77 commented 6 years ago

@sdc50 I think property has a ton of meanings to it and not just in code, but I think it fits better in our data model. I'm not really feeling the name 'locations' because it doesn't fit our non-geo data model very well. I would love to have a name that can represent both non-geo and geo data. Thanks

sdc50 commented 6 years ago

@AaronV77 I agree with you that property has a ton of meanings. Datasets have many properties and that's why I don't think it makes sense to officially assign one property (i.e. its feature) the name property. Maybe I wasn't clear in my previous post, but I was suggesting that the term location be used to describe a specific property only for the geographic datasets. The non-geo datasets wouldn't have a location property.

AaronV77 commented 6 years ago

@sdc50 my apologies, yeah, the way I understood it was that you wanted to change the name to 'locations'. I would agree with adding 'location' somewhere within the metadata, but I'm still worried about the word 'feature' having more than one meaning within the Quest code / documentation. Thanks

sdc50 commented 6 years ago

Let me try to clarify. I am proposing that we change the name of features to locations. I am also proposing that we change the data model so that not all datasets have a location. This would have a few implications:

We will need to rethink our searching tools. Right now we search data by downloading metadata from each service that gives us an idea of what data it might have by listing the locations/features and the parameters. The problem is that many datasets that we are interested in don't fit the location/parameter paradigm.
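
To make the limitation concrete, here is a small sketch of a search built on the location/parameter paradigm. It does not use Quest's real search API; the `CATALOG` records, their keys, and the `search` function are all hypothetical and exist only for this example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical catalog metadata, one dict per entry (not Quest's real format).
CATALOG: List[Dict] = [
    {"id": "gauge-1", "location": (-90.9, 32.3), "parameters": ["streamflow"]},
    {"id": "gauge-2", "location": (-91.2, 30.4), "parameters": ["gage_height"]},
    {"id": "report-2018", "location": None, "parameters": []},
]


def search(catalog: List[Dict],
           parameter: Optional[str] = None,
           bbox: Optional[Tuple[float, float, float, float]] = None) -> List[str]:
    """Return ids of entries matching a parameter and/or a bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat).
    """
    hits = []
    for entry in catalog:
        if parameter is not None and parameter not in entry["parameters"]:
            continue
        if bbox is not None:
            loc = entry["location"]
            if loc is None:
                continue  # an entry with no location can never match a bbox
            lon, lat = loc
            if not (bbox[0] <= lon <= bbox[2] and bbox[1] <= lat <= bbox[3]):
                continue
        hits.append(entry["id"])
    return hits
```

In this sketch the `report-2018` entry is only returned by an unfiltered search: it has neither a location nor a parameter, so any location or parameter filter excludes it.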

sdc50 commented 6 years ago

Renaming filters to tools:

AaronV77 commented 6 years ago

Can we do 'get_tool_options' rather than 'get_tools_options'? Thanks

AaronV77 commented 6 years ago

Also, instead of renaming 'get_filters' to 'list_tools', can we call it 'get_tools'? All the other methods follow this pattern, so it would not make sense to change it now.
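
One way a rename like this could be rolled out without breaking existing callers is to keep the old name as a deprecated alias for a while. The sketch below is only an illustration: the `deprecated_alias` helper and the placeholder body of `get_tools` are assumptions, not the change made in Quest.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Return a wrapper that warns, then forwards the call to new_func."""
    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name}() is deprecated; use {new_func.__name__}() instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_func(*args, **kwargs)

    wrapper.__name__ = old_name  # keep the old name visible in tracebacks
    return wrapper


def get_tools():
    """Placeholder body for illustration; the real function would list tools."""
    return ["example-tool"]


# The old name keeps working but emits a DeprecationWarning when called.
get_filters = deprecated_alias(get_tools, "get_filters")
```

Calling `get_filters()` here returns the same result as `get_tools()` and emits a `DeprecationWarning` pointing at the caller.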

sdc50 commented 6 years ago

Renaming features to catalog:

This refactor will remove the idea of collection features. Datasets will now point directly to Collections and will reference the service/service_id they came from.
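
As a rough sketch of what that reference might look like, a dataset could carry its collection plus a single string that combines the service and the service_id. The string format, the `svc://` example value, and the helper names below are assumptions for illustration, not Quest's confirmed format.

```python
from dataclasses import dataclass
from typing import Tuple


def make_catalog_entry(service: str, service_id: str) -> str:
    """Join a service identifier and a service_id into one reference string."""
    return f"{service}/{service_id}"


def split_catalog_entry(entry: str) -> Tuple[str, str]:
    """Split on the last '/', assuming the service_id itself contains no '/'."""
    service, _, service_id = entry.rpartition("/")
    return service, service_id


@dataclass
class Dataset:
    name: str
    collection: str     # datasets point directly at a collection
    catalog_entry: str  # where the data came from (service plus service_id)


# Hypothetical service identifier and id, made up for the example.
entry = make_catalog_entry("svc://example-provider:example-service", "site-001")
dataset = Dataset(name="d1", collection="demo", catalog_entry=entry)
```

With this shape a dataset needs no intermediate feature record, and `split_catalog_entry` recovers the service and service_id from the stored string.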

sdc50 commented 6 years ago

Here are some proposed changes from the database file (for reference):

    class Collection(db.Entity):
        name = orm.PrimaryKey(str)
        display_name = orm.Optional(str)
        description = orm.Optional(str)
        created_at = orm.Required(datetime, default=lambda: datetime.now())  # callable, so the timestamp is taken per row rather than once at import
        updated_at = orm.Optional(datetime)
        metadata = orm.Optional(orm.Json)

        # setup relationships
        datasets = orm.Set('Dataset')

    # Feature table should be commented out (possibly will be brought back as the Catalog table).

    class Dataset(db.Entity):
        name = orm.PrimaryKey(str)
        display_name = orm.Optional(str)
        description = orm.Optional(str, nullable=True)
        created_at = orm.Required(datetime, default=lambda: datetime.now())  # callable, so the timestamp is taken per row rather than once at import
        updated_at = orm.Optional(datetime)
        metadata = orm.Optional(orm.Json)

        # dataset require metadata
        parameter = orm.Optional(orm.Json)
        unit = orm.Optional(str)
        datatype = orm.Optional(str)
        file_format = orm.Optional(str)
        source = orm.Optional(str)
        options = orm.Optional(orm.Json)
        status = orm.Optional(str)
        message = orm.Optional(str)
        file_path = orm.Optional(str, nullable=True)
        visualization_path = orm.Optional(str)

        # setup relationships
        collection = orm.Required(Collection)
        catalog_entry = orm.Required(str)  # TODO for now this will just be the service uri with id but will be orm.Required(Catalog)

AaronV77 commented 6 years ago

@sdc50 you have get_features twice in the naming rework. Was there another method that you were thinking of, or am I missing something? Thanks

sdc50 commented 5 years ago

This is addressed and closed by PR #111