Coleridge-Initiative / RCDatasets

Creative Commons Zero v1.0 Universal

dataset_updates #118

Closed by menoah 4 years ago

menoah commented 4 years ago

There are two problems:

  • United States Geological Survey is a duplicate -- already in providers.json
  • the indexing conflicts with the previous pull request, and most of the changes require renumbering

Sorry about that.
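A duplicate like the one above can be caught before opening a PR. What follows is only a minimal sketch: it assumes providers.json holds a list of objects with "id" and "title" keys (the real schema may differ), and `find_duplicate_titles` is a hypothetical helper, not something from the repo's test.py.

```python
from collections import defaultdict


def find_duplicate_titles(providers):
    """Map each normalized title that occurs more than once to its provider ids."""
    ids_by_title = defaultdict(list)
    for provider in providers:
        # Normalize case and whitespace so near-identical titles collide.
        key = " ".join(provider["title"].lower().split())
        ids_by_title[key].append(provider["id"])
    return {title: ids for title, ids in ids_by_title.items() if len(ids) > 1}


if __name__ == "__main__":
    # Illustrative entries only; these ids are made up, not taken from the repo.
    sample = [
        {"id": "provider-001", "title": "United States Geological Survey"},
        {"id": "provider-002", "title": "Example Provider"},
        {"id": "provider-003", "title": "United States  Geological Survey"},
    ]
    print(find_duplicate_titles(sample))
```

In practice the list would come from `json.load` on providers.json rather than from a hard-coded sample.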

ceteri commented 4 years ago

Thank you @menoah

However, there are still merge conflicts in providers.json: several entries have IDs that overlap with those from the previous PR.

You might try updating your branch from current master first, then applying the new numbering -- before committing again.
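The renumbering step could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's tooling: it assumes IDs have the form "provider-NNN" and that entries are dicts with an "id" key, and `renumber` is a hypothetical helper.

```python
import re

# Assumed id format; adjust if the real ids differ.
ID_PATTERN = re.compile(r"^provider-(\d+)$")


def id_number(entry):
    """Extract the numeric part of an entry's id, or raise if it is malformed."""
    match = ID_PATTERN.match(entry["id"])
    if match is None:
        raise ValueError("unexpected id format: {!r}".format(entry["id"]))
    return int(match.group(1))


def renumber(master_entries, new_entries):
    """Give new_entries ids that continue after the highest id on master.

    Returns the renumbered entries plus an old-id to new-id mapping.
    """
    highest = max((id_number(e) for e in master_entries), default=0)
    renumbered, old_to_new = [], {}
    for offset, entry in enumerate(new_entries, start=1):
        # Three-digit zero padding is an assumption about the id style.
        new_id = "provider-{:03d}".format(highest + offset)
        old_to_new[entry["id"]] = new_id
        renumbered.append({**entry, "id": new_id})
    return renumbered, old_to_new
```

The returned mapping is there so that any dataset entries pointing at the old provider IDs can be updated in the same commit.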

ceteri commented 4 years ago

This is close -- though there are still two missing providers:

(base) derwen:~/src/RCDatasets$ ./test.py
..unknown providers: ['provider-279', 'provider-280']
F
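For reference, a check of this kind can be approximated as below. This is a sketch only, not the code in test.py: it assumes each dataset entry carries a "provider" key holding a provider ID, and `find_unknown_providers` is a hypothetical name.

```python
def find_unknown_providers(datasets, providers):
    """Return the sorted provider ids that datasets reference but providers lack."""
    known = {provider["id"] for provider in providers}
    # The "provider" key on each dataset is an assumed field name.
    referenced = {dataset["provider"] for dataset in datasets}
    return sorted(referenced - known)


if __name__ == "__main__":
    # Illustrative data only; not the contents of the repo's JSON files.
    providers = [{"id": "provider-278"}]
    datasets = [
        {"id": "dataset-650", "provider": "provider-278"},
        {"id": "dataset-651", "provider": "provider-279"},
    ]
    print("unknown providers:", find_unknown_providers(datasets, providers))
```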

srand525 commented 4 years ago

Hi Marcos, a few issues came up while running the test:

F
650 datasets loaded

279 providers loaded
.....E
======================================================================
ERROR: test_unique_titles (__main__.TestVerifyDatasets)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 96, in test_unique_titles
    raise Exception("{}: duplicate title {}".format(dataset["id"], title))
Exception: dataset-652: duplicate title National Elevation Dataset

======================================================================
FAIL: test_enum_providers (__main__.TestVerifyDatasets)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 134, in test_enum_providers
    self.assertTrue(len(unknowns) < 1)
AssertionError: False is not true

----------------------------------------------------------------------

ceteri commented 4 years ago

Looks good now.