open-data / ckanext-recombinant

Create datastore tables for organizations and provide combined output

OPEN-194: load-csv: create missing datasets, remove org extras #87

Closed: RabiaSajjad closed this 5 years ago

wardi commented 5 years ago

The lenient flag can be removed now too.

RabiaSajjad commented 5 years ago

Should I default strict=False? https://github.com/open-data/ckanext-recombinant/blob/60cba2e7892ce3d01f444e29c1a7a2da9dfc0cd8/ckanext/recombinant/commands.py#L256

wardi commented 5 years ago

Actually, that question points to csv_data_batch as a better place to remove the csv_org_extras columns.

wardi commented 5 years ago

You could add a remove_org_extras=False option to that function. In ckanext-canada/ckanext/canada/ati.py the org extras returned are used to populate the ATI Solr index, but for load-csv we would always want to remove them.
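A minimal sketch of what such an option could look like. This is not the actual csv_data_batch code: the function name `batch_by_org`, its parameters, and the column names `owner_org` and `owner_org_title` are hypothetical, chosen only to illustrate an opt-in flag that strips the org extras columns from each record while leaving the default behaviour unchanged for callers that still need them.

```python
from itertools import groupby


def batch_by_org(rows, org_column, org_extras, remove_org_extras=False):
    """Yield (org, records) batches from rows already sorted by org_column.

    Hypothetical stand-in for a CSV batching function: when
    remove_org_extras is True, the columns named in org_extras are
    dropped from every record; otherwise records pass through as-is.
    """
    drop = set(org_extras) if remove_org_extras else set()
    # groupby only merges consecutive rows, hence the sorted-input assumption
    for org, group in groupby(rows, key=lambda row: row[org_column]):
        records = [
            {key: value for key, value in row.items() if key not in drop}
            for row in group
        ]
        yield org, records


# Example rows with a made-up extras column
rows = [
    {"owner_org": "a", "owner_org_title": "Org A", "x": "1"},
    {"owner_org": "a", "owner_org_title": "Org A", "x": "2"},
    {"owner_org": "b", "owner_org_title": "Org B", "x": "3"},
]
with_extras = list(batch_by_org(rows, "owner_org", ["owner_org_title"]))
without_extras = list(
    batch_by_org(rows, "owner_org", ["owner_org_title"], remove_org_extras=True)
)
```

Keeping the default at False means a caller like the ATI indexer would see no change, while a loader could pass remove_org_extras=True.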

RabiaSajjad commented 5 years ago

Looks like csv_data_batch is already taking care of csv_org_extras in https://github.com/open-data/ckanext-recombinant/blob/60cba2e7892ce3d01f444e29c1a7a2da9dfc0cd8/ckanext/recombinant/read_csv.py#L29

wardi commented 5 years ago

No, that just prevents an assert failure when strict=True.
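To illustrate the distinction, here is a rough sketch of a strict header check that merely ignores the extras columns when comparing against the expected field list. It is an assumption about the general shape of such a check, not the code at the linked line; `check_header` and its parameters are hypothetical names.

```python
def check_header(cols, expected, org_extras, strict=True):
    """Assert that the CSV header matches the expected field names.

    Hypothetical sketch: the org extras columns are excluded from the
    comparison only. Nothing here removes them from the records that
    are read afterwards.
    """
    if not strict:
        return
    data_cols = [col for col in cols if col not in org_extras]
    assert data_cols == expected, "column mismatch: %r != %r" % (data_cols, expected)


# Passes: the extras column is skipped during the comparison
check_header(["a", "b", "owner_org_title"], ["a", "b"], ["owner_org_title"])
```

A check like this lets a CSV with extra columns pass validation, but the rows themselves would still carry those columns unless they are stripped separately.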

RabiaSajjad commented 5 years ago

Hi Ian, I think strict is True by default

https://github.com/open-data/ckanext-recombinant/blob/ed9d723fa125e6e697a75e9df4a1e4c11458686a/ckanext/recombinant/commands.py#L253

Moreover, to test I started over again: deleted ati, loaded the CSV as-is, updated the fields, migrated the CSV, and loaded it back. All steps completed with no issues. Let me know if you think further changes are required.

Thanks!