OCHA-DAP / Data-Team

A place for tracking data team issues

Delete duplicate data series #52

Closed takavarasha closed 10 years ago

takavarasha commented 10 years ago

Delete the following four data series which are duplicated

luiscape commented 10 years ago

@takavarasha Have these indicators been deleted yet?

takavarasha commented 10 years ago

Not yet. But this is a very low priority task because it does not block anything at the moment.

luiscape commented 10 years ago

Got it. I was just wondering whether they also show up in any automated process when querying the API. :-)

cjhendrix commented 10 years ago

@luiscape The corresponding datasets should be deleted on ckan as well.

JavierTeran commented 10 years ago

@luiscape @cjhendrix I think this is top priority, especially for demo purposes. @cjhendrix Where do we need to delete them from? CPS?

JavierTeran commented 10 years ago

@luiscape This is the issue I was referring to re: Duplicates

cjhendrix commented 10 years ago

They will have to be deleted in both places, starting with CPS. Otherwise, they will reappear on ckan the next time the script is run.

luiscape commented 10 years ago

I deleted 13 datasets from CKAN. Here is the list:

However, the system behaves strangely. When you search or query the API, the entries above are no longer returned. But if you use the legacy URL they left behind, for example data.hdx.rwlabs.org/dataset/zambia-baseline-data, you can still access the dataset. Any ideas why?

alexandru-m-g commented 10 years ago

@luiscape What actually happens when someone deletes a dataset is that a flag is changed in the database so that the dataset is marked as "deleted". The dataset information is still in the database, but it is removed from the solr search engine. Generally, ckan works like a revisioning system, so historical data is kept. Apparently sysadmins are still allowed to access deleted datasets (normal users aren't).
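As a rough illustration of the behavior described above, here is a toy model in Python. It is not CKAN's actual code: the `ToyCatalog` class, its method names, and the `is_sysadmin` parameter are all hypothetical and exist only for this sketch. It shows why a search misses a deleted dataset while a sysadmin can still open it by name.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    state: str = "active"  # flipped to "deleted" on soft delete


@dataclass
class ToyCatalog:
    """Toy stand-in for the database plus the search index (hypothetical)."""
    db: dict = field(default_factory=dict)
    search_index: set = field(default_factory=set)

    def add(self, name):
        self.db[name] = Dataset(name)
        self.search_index.add(name)

    def soft_delete(self, name):
        # The record stays in the database; only the flag changes.
        self.db[name].state = "deleted"
        # The dataset is dropped from the search index, so searches miss it.
        self.search_index.discard(name)

    def search(self, term):
        return sorted(n for n in self.search_index if term in n)

    def show(self, name, is_sysadmin=False):
        dataset = self.db[name]
        # Assumption for this sketch: only sysadmins may view deleted datasets.
        if dataset.state == "deleted" and not is_sysadmin:
            raise PermissionError(f"{name} is marked as deleted")
        return dataset


catalog = ToyCatalog()
catalog.add("zambia-baseline-data")
catalog.soft_delete("zambia-baseline-data")

print(catalog.search("zambia"))  # search no longer finds it
print(catalog.show("zambia-baseline-data", is_sysadmin=True).state)
```

In this sketch the direct lookup by name plays the role of the legacy dataset URL: the record is still there, just flagged and hidden from search.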

luiscape commented 10 years ago

@alexandru-m-g Interesting behavior -- thank you for the explanation. Note that I am an admin in the system. How can I permanently delete a dataset using the UI?

luiscape commented 10 years ago

@JavierTeran What is the status of this task? The 4 duplicate indicators apparently were deleted by @takavarasha and I deleted the duplicate datasets. Is this task done?

cjhendrix commented 10 years ago

@luiscape Sadly, you can't permanently delete a dataset. Something users may want us to address.

alexandru-m-g commented 10 years ago

@cjhendrix @luiscape What we can do pretty easily, though, is stop displaying deleted (marked as deleted) datasets to anyone, even sysadmins. The user then wouldn't know what's going on behind the scenes. Alternatively, we could show a clear warning on the page when someone is viewing a deleted dataset. I don't think either would be a big change.

luiscape commented 10 years ago

@alexandru-m-g Warning sounds interesting. In my profile page, I can see a little badge that says a dataset has been deleted. However, I can't see that warning within the dataset page. Adding it would be quite nice, in my opinion. :-)

[Image: screen shot 2014-05-20 at 11:55:01 am]

alexandru-m-g commented 10 years ago

@cjhendrix Should I create a ticket for this and put it in the backlog (or in Sprint)?

cjhendrix commented 10 years ago

I think that's a good quick solution. We could basically just show the status badge (there is also a "draft" badge for datasets without files) on any dataset list.

alexandru-m-g commented 10 years ago

Opened https://github.com/OCHA-DAP/hdx-ckan/issues/356

JavierTeran commented 10 years ago

@takavarasha please confirm this activity is completed and close this issue. Thanks.