datopian / datahub-qa

:package: Bugs, issues and suggestions for datahub.io
https://datahub.io/

[assembler] structure of zipped dataset (archive) #129

Closed AcckiyGerman closed 6 years ago

AcckiyGerman commented 6 years ago

Since data get will now fetch and unpack the archived version of the dataset, we should make sure the archive has a clean and valid structure.

How to reproduce

Expected behavior

zelima commented 6 years ago

@AcckiyGerman Datasets pushed after today should have a nice structure, with a README and archive folders.

AcckiyGerman commented 6 years ago

@zelima This is a very cool fix!

AcckiyGerman/finance-vix$ tree
.
├── archive
│   └── vix-daily.csv
├── data
│   ├── validation_report.json
│   ├── vix-daily.csv
│   ├── vix-daily_csv.csv
│   └── vix-daily_json.json
├── datapackage.json
└── README.md

I have only one suggestion:

$ head data/vix-daily.csv 
Date,VIXOpen,VIXHigh,VIXLow,VIXClose
2004-01-02,17.96,18.68,17.54,18.22
2004-01-05,18.45,18.49,17.44,17.49
2004-01-06,17.66,17.67,16.19,16.73
2004-01-07,16.72,16.75,15.05,15.05
2004-01-08,15.42,15.68,15.32,15.61
2004-01-09,16.15,16.88,15.57,16.75
2004-01-12,17.32,17.46,16.79,16.82
2004-01-13,16.06,18.33,16.53,18.04
2004-01-14,17.29,17.03,16.04,16.75

$ head data/vix-daily_csv.csv 
Date,VIXOpen,VIXHigh,VIXLow,VIXClose
2004-01-02,17.96,18.68,17.54,18.22
2004-01-05,18.45,18.49,17.44,17.49
2004-01-06,17.66,17.67,16.19,16.73
2004-01-07,16.72,16.75,15.05,15.05
2004-01-08,15.42,15.68,15.32,15.61
2004-01-09,16.15,16.88,15.57,16.75
2004-01-12,17.32,17.46,16.79,16.82
2004-01-13,16.06,18.33,16.53,18.04
2004-01-14,17.29,17.03,16.04,16.75

It seems these last two are the same file. Could we get rid of one of them?
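The head output only compares the first ten lines of each file. One way to confirm that whole files are identical would be to compare content hashes. The following is a minimal sketch, not part of the datahub tooling: the helper names file_digest and find_duplicates are hypothetical, and it assumes it is run from the root of the unpacked dataset directory.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root by content; return only groups of 2+ files."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Store paths relative to root, with forward slashes
            by_digest[file_digest(path)].append(path.relative_to(root).as_posix())
    return [names for names in by_digest.values() if len(names) > 1]


if __name__ == "__main__":
    # "." is assumed to be the unpacked dataset directory
    for group in find_duplicates("."):
        print("identical:", ", ".join(group))
```

Each printed line lists a group of files whose bytes match exactly, so files that differ only in formatting (line endings, quoting) would not be reported as duplicates.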

zelima commented 6 years ago

@AcckiyGerman Yes, we can, but I did this on purpose, to stay aligned with the original dp.json (which contains the same duplicate as well). Let's close this one and open a new one after we discuss that.

AcckiyGerman commented 6 years ago

FIXED: The user gets a valid dataset with a clean folder and file structure after data get.