Open danfowler opened 8 years ago
Hey @danfowler
@pwalsh updated the stats above. What do you mean by "resolve package name issues"?
Note: as per earlier discussions a few months ago, we do not need to migrate the private datasets IMO. In general, private datasets were just datasets people never completed or got around to publishing, and we can just leave them (note: we can archive them somewhere safe, like OK's standard backup).
BTW could we get a CSV list of all the public datasets and their owners?
@danfowler could I suggest gists in future? They're much nicer to read and update ;-) For others, here's a datapipes preview: http://datapipes.okfnlabs.org/csv/html?url=https://dl.dropboxusercontent.com/u/12909676/owners.csv
Numbers
Private datasets: 586 (981 datasets have empty or incomplete "data" model)
Public datasets: 1094
Sizes
Public Datasets
without archived sources: 9.86 GB
with archived sources: 17.32 GB
Private Datasets
without archived sources: 5.1 GB
with archived sources: 10.61 GB
Sources
Private Datasets
valid_sources: 456 (78% of total sources)
invalid_sources: 128
Public Datasets
valid_sources: 1396 (92% of total sources)
invalid_sources: 120
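For anyone wanting to re-derive the percentages above, here is a minimal sketch. It assumes "total sources" means valid plus invalid sources for each group; the dictionary layout and the `valid_share` helper are illustrative names, not part of any existing script.

```python
# Source counts copied from the figures above.
sources = {
    "private": {"valid": 456, "invalid": 128},
    "public": {"valid": 1396, "invalid": 120},
}


def valid_share(counts):
    """Return the share of valid sources as a whole-number percentage.

    Assumes the total is simply valid + invalid (an assumption, not
    something the stats state explicitly).
    """
    total = counts["valid"] + counts["invalid"]
    return round(100 * counts["valid"] / total)


for visibility, counts in sources.items():
    print(f"{visibility}: {valid_share(counts)}% valid")
```

Under that assumption the results agree with the posted 78% and 92%.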
Usernames
private
1 owner: 557 (2.7 GB)
2 owners: 19 (768.1 MB)
3 owners: 7 (1.6 GB)
4 owners: 1 (0)
5 owners: 2 (382.2 kB)
6 owners: 1 (3.0 MB)
8 owners: 1 (0)
public
0 owners: 8 (134.5 MB)
1 owner: 987 (4.2 GB)
2 owners: 75 (4.4 GB)
3 owners: 13 (557.3 MB)
4 owners: 8 (83.8 MB)
5 owners: 3 (461.3 MB)
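A quick way to sanity-check the owner breakdown against the dataset totals under "Numbers" is to sum the buckets. This is only a sketch: the figures are copied from this thread, and `owner_buckets`, `reported_totals`, and `bucket_total` are hypothetical names used for illustration.

```python
# Number of datasets per owner count, copied from the lists above.
owner_buckets = {
    "private": {1: 557, 2: 19, 3: 7, 4: 1, 5: 2, 6: 1, 8: 1},
    "public": {0: 8, 1: 987, 2: 75, 3: 13, 4: 8, 5: 3},
}

# Dataset totals as reported under "Numbers".
reported_totals = {"private": 586, "public": 1094}


def bucket_total(buckets):
    """Sum the dataset counts across all owner-count buckets."""
    return sum(buckets.values())


for visibility, buckets in owner_buckets.items():
    total = bucket_total(buckets)
    difference = total - reported_totals[visibility]
    print(f"{visibility}: buckets sum to {total} "
          f"(reported {reported_totals[visibility]}, difference {difference:+d})")
```

With the figures as posted, the public buckets sum to 1094, matching the reported total, while the private buckets sum to 588, two more than the 586 reported, which may be worth double-checking.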