A high-level Python framework to evaluate the skill of geospatial datasets by comparing candidate maps to benchmark maps, producing agreement maps and metrics.
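For orientation, here is a minimal sketch of that workflow, assuming the accessor-based API shown in the project README; file names and category values are placeholders:

```python
import rioxarray as rxr
import gval  # importing gval registers the .gval accessor on xarray objects

# Placeholder rasters; any aligned candidate/benchmark pair works.
candidate = rxr.open_rasterio("candidate_map.tif", mask_and_scale=True)
benchmark = rxr.open_rasterio("benchmark_map.tif", mask_and_scale=True)

# Compare candidate to benchmark, yielding an agreement map,
# a crosstab table, and a metric table.
agreement_map, crosstab_table, metric_table = candidate.gval.categorical_compare(
    benchmark_map=benchmark,
    positive_categories=[2],
    negative_categories=[0, 1],
)
```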
Over the past few weeks, we've made a lot of progress implementing features and advancing the documentation. However, there are a few items we should revisit before advancing the project further and beginning to apply it to FIM evals. The following list is meant to serve as a checklist, to be updated as items are completed. Any large tasks can be spun off into issues and marked with an X here.
[x] How does cat_plot() handle multiple categories? What about datasets?
[x] cat_plot() references not rendering in docs properly.
[x] Continuous example implementation.
[x] Implement cont_plot()
[x] Broken links in documentation
[x] Rename "extensions" in the documentation to something context-specific rather than software-specific.
[x] class gval.statistics.categorical_statistics.CategoricalStatistics could use a review of its available attributes and methods. It's not entirely clear that all of these should be public; currently only register_functions has a public use case. It would also be helpful to list what users have access to when they choose 'all' metrics. This should include a function that returns function names and objects, possibly as a dictionary (see the sketch after this list). There are a lot of other attributes and methods that should probably be private. The public ones should have examples and more detail so that a user can better internalize what's going on, and private attributes/methods should not be exposed in the documentation.
[x] Continuous statistics are currently not listed in the documentation.
[x] We have no means of registering continuous statistics (see the registration sketch after this list).
[x] Can schemas have their columns defined in the documentation? Can the columns() class method be listed where it is inherited, or is that not automatic?
[x] For gval.statistics.categorical_stat_funcs, the Returns portion of each docstring should state the object type (float) with the description provided below it (see the docstring sketch after this list).
[x] The link under CSI should be a reference.
[x] Is there a good explanation of what the prevalence threshold is? I see the equation on Wikipedia but no explanation (see the note after this list).
[x] Number 7 in the contributing guidelines talks about documentation, then switches context to the README. Maybe make one step for the README and another step for the docs. Start by saying that the README is composed of XX.md, XX.md, and ...; then say those should be edited directly and compiled using the XX command. Then add a docs step highlighting that any change to the README involves a change in the docs.
[x] Should the command in number 9 of the contributing guidelines have a -u flag?
[x] I don't think we are currently supporting Docker, so this information in the contributing guidelines might be misleading. If we want to brush up the Dockerfile we can, but we should change the title of this section to reflect that it covers Docker use and may have user or developer applications.
[x] The last sentence in the positive_categories argument docstring for the categorical_compare() method should read differently; a better reference to the average argument should be made.
[x] The metrics argument should include some reference to what "all" means and how to get that list of metrics.
[x] The comparison_function argument in categorical_compare() does not reference all possible functions. How does someone list or discover these?
[x] The allow_candidate_argument docstring contains a typo: "If “pairing_dict” is set selected".
[x] The weights argument should explicitly state that it's only used when average="weighted" (see the sketch after this list).
[x] The rasterize_attributes argument should state that it only applies when a GeoDataFrame is passed as the benchmark_map argument. Also, what is the behavior of a None value, i.e., what is the default behavior?
[x] The terms statistics and metrics are used interchangeably. Should we just use one term? Statistics?
[x] GVALDataFrame.compute_metrics should probably be more specific: compute_categorical_metrics.
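Regarding the CategoricalStatistics item above, a hypothetical helper for listing what 'all' resolves to; the name available_functions and the module introspection are assumptions, not existing API:

```python
from typing import Callable, Dict

import gval.statistics.categorical_stat_funcs as cat_funcs


def available_functions() -> Dict[str, Callable]:
    """Hypothetical helper: map each public categorical statistic name
    to its function object, e.g. for documenting metrics='all'."""
    # Naive introspection: also picks up any callables imported into
    # the module, which a real implementation would want to exclude.
    return {
        name: obj
        for name, obj in vars(cat_funcs).items()
        if callable(obj) and not name.startswith("_")
    }


# e.g. sorted(available_functions()) would list the names behind 'all'
```

A similar listing could also answer the comparison_function discovery question above.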
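For the continuous-statistics registration gap, a sketch of what a registry mirroring the categorical register_functions mechanism could look like; the ContinuousStatistics class and its register_function decorator are assumptions about where such a hook would live, not the shipped API:

```python
import numpy as np
import xarray as xr

# Hypothetical mirror of the categorical registration mechanism.
from gval.statistics.continuous_statistics import ContinuousStatistics

stats = ContinuousStatistics()


@stats.register_function(name="mean_absolute_error")
def mean_absolute_error(error: xr.DataArray) -> float:
    """MAE over a precomputed error (candidate - benchmark) array."""
    return float(np.abs(error).mean())
```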
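To illustrate the Returns format requested for gval.statistics.categorical_stat_funcs, a hedged version of one stat function; the signature is representative, not copied from the source:

```python
def critical_success_index(tp: float, fp: float, fn: float) -> float:
    """
    Computes the critical success index (CSI).

    Returns
    -------
    float
        Critical success index, tp / (tp + fp + fn).
    """
    return tp / (tp + fp + fn)
```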
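On the prevalence threshold question: per the Wikipedia formulation, it is roughly the prevalence below which a test's positive predictive value falls off sharply, given by

$$
\phi = \frac{\sqrt{\mathrm{TPR}\cdot\mathrm{FPR}} - \mathrm{FPR}}{\mathrm{TPR} - \mathrm{FPR}}
$$

where TPR is the true positive rate (sensitivity) and FPR is the false positive rate (1 − specificity). A short explanation along these lines could accompany the equation in the docs.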
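Finally, for the weights/average interplay noted above, a sketch of a weighted call reusing the candidate and benchmark from the first example; category values and weights are placeholders, and the argument names follow the items above:

```python
# Weighted multi-category comparison: weights are only consulted when
# average="weighted", with one weight per positive category.
agreement_map, crosstab_table, metric_table = candidate.gval.categorical_compare(
    benchmark_map=benchmark,
    positive_categories=[1, 2],
    negative_categories=[0],
    metrics="all",
    average="weighted",
    weights=[0.25, 0.75],
)
```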