datopian / datahub-qa

:package: Bugs, issues and suggestions for datahub.io
https://datahub.io/

`data get` only acquires the html page for the dataset not the datapackage.zip #43

Closed MAliNaqvi closed 6 years ago

MAliNaqvi commented 6 years ago

Currently, `data get` acquires the HTML page for a dataset instead of its datapackage.zip.

Expected behaviour: `data get` should download the datapackage.zip of the dataset.

Steps to reproduce this error:

Alis-MBP-3:~ alinaqvi$ data get http://datahub.io/core/co2-ppm
Time elapsed: 0.64 s
Dataset/file is saved in "co2-ppm."
Alis-MBP-3:~ alinaqvi$ data --version
0.6.3
Alis-MBP-3:~ alinaqvi$ file co2-ppm. 
co2-ppm.: HTML document text, UTF-8 Unicode text, with very long lines
Alis-MBP-3:~ alinaqvi$ mv co2-ppm. co2-ppm.html
Alis-MBP-3:~ alinaqvi$ open co2-ppm.html

which shows: https://www.dropbox.com/s/abu72lhip7yyvln/Screenshot%202018-01-16%2017.22.28.png?dl=0

So `data get` only obtains the HTML page of the dataset.
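The manual check with `file` above can also be scripted. The following is a minimal sketch, not part of data-cli: the function name `classify_download` is hypothetical, and the HTML test is a simple heuristic on the first bytes of the payload.

```python
import io
import zipfile


def classify_download(payload: bytes) -> str:
    """Return 'zip', 'html', or 'unknown' for the raw bytes of a download."""
    # zipfile.is_zipfile accepts a file-like object and checks for a valid
    # ZIP end-of-central-directory record.
    if zipfile.is_zipfile(io.BytesIO(payload)):
        return "zip"
    # Heuristic only: an HTML page normally starts with a doctype or <html>.
    head = payload[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"
```

Reading the saved file in binary mode and passing its bytes to `classify_download` would report `html` for the `co2-ppm.` file described above, assuming it starts with a doctype or an `<html>` tag.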

Mikanebu commented 6 years ago

@MAliNaqvi Thanks for reporting this. We will investigate and fix it soon.

AcckiyGerman commented 6 years ago

This issue is probably related to https://github.com/datahq/data-cli/issues/246

Mikanebu commented 6 years ago

@AcckiyGerman This issue is fixed here https://github.com/datahq/data-cli/issues/266.

AcckiyGerman commented 6 years ago

TESTED: FAIL

  1. data get https://github.com/datasets/imf-weo - OK
  2. data get http://datahub.io/core/finance-vix - FAILED
    user@pc:~/work/datasets$ node ../datahq/data-cli/bin/data.js get http://datahub.io/core/finance-vix
    Time elapsed: 1.33 s
    Dataset/file is saved in "core/finance-vix"
    user@pc:~/work/datasets$ cd core/finance-vix/
    datapackage_zip/       vix-daily_csv/         vix-daily_json/        
    vix-daily/             vix-daily_csv_preview/ 
    user@pc:~/work/datasets$ cd core/finance-vix/vix-daily_csv
    vix-daily_csv/         vix-daily_csv_preview/ 
    user@pc:~/work/datasets$ cd core/finance-vix/vix-daily_csv/data/c1df2be3bc75e174ab6e2ceca7834192/
    user@pc:~/work/datasets/core/finance-vix/vix-daily_csv/data/c1df2be3bc75e174ab6e2ceca7834192$ ls
    vix-daily_csv.csv
    user@pc:~/work/datasets/core/finance-vix/vix-daily_csv/data/c1df2be3bc75e174ab6e2ceca7834192$ cat vix-daily_csv.csv 
    <?xml version="1.0" encoding="UTF-8"?>
    <Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>core/finance-vix/latest/https://pkgstore.datahub.io/core/finance-vix/vix-daily_csv/data/c1df2be3bc75e174ab6e2ceca7834192/vix-daily_csv.csv</Key><RequestId>B30A53B2D7C18ED6</RequestId><HostId>V16M1k+JI/L/FgqZ9U02FW3CPxZrurRPGq+Kz76eew4ZymRYu3KsKWb2SU4k22aDtQChAWIx2v0=</HostId></Error>
    user@pc:~/work/datasets/core/finance-vix/vix-daily_csv/data/c1df2be3bc75e174ab6e2ceca7834192$ 
  3. data get in NTFS folder - Partially passed - the dataset is saved, but no data inside, like in the previous case

@Mikanebu

Mikanebu commented 6 years ago

@AcckiyGerman Please, pull the latest changes from data.js repo and try again.

AcckiyGerman commented 6 years ago

TESTED:

Issue is FIXED: @MAliNaqvi Thanks for the report! The updated version will be published soon.

Mikanebu commented 6 years ago

`data get` does not work for any users other than core. I tried to run `data get http://datahub.io/Mikanebu/push-speed-62-1mb` and got:

invalid json response body at http://pkgstore.datahub.io/Mikanebu/push-speed-62-1mb/latest/datapackage.json reason: Unexpected token < in JSON at position 0

The reason: if you look at the pkgstore URL, it uses the username, but it should use the userid.

The solution: we need to call the resolver API to get the userid and pass it on.
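To illustrate the proposed fix, here is a rough sketch of building the pkgstore URL from a resolved userid rather than the username. It is not the data.js implementation: `datapackage_url`, `lookup_owner_id`, and the `KNOWN_IDS` table are hypothetical stand-ins, and the real resolver API call is not shown in this thread. The one non-core id in the table is taken from an error message later in the thread.

```python
from typing import Callable

PKGSTORE = "https://pkgstore.datahub.io"

# Hypothetical stand-in for the resolver API: a static username -> userid map.
# The AcckiyGerman id is the one visible in the pkgstore URLs in this thread.
KNOWN_IDS = {
    "core": "core",
    "AcckiyGerman": "a08d3588fbae0355042537595c65819d",
}


def lookup_owner_id(username: str) -> str:
    """Return the userid for a username (raises KeyError if unknown)."""
    return KNOWN_IDS[username]


def datapackage_url(
    username: str,
    dataset: str,
    resolve: Callable[[str], str] = lookup_owner_id,
    revision: str = "latest",
) -> str:
    """Build the datapackage.json URL using the resolved userid."""
    owner_id = resolve(username)
    return f"{PKGSTORE}/{owner_id}/{dataset}/{revision}/datapackage.json"
```

Passing the resolver in as a parameter keeps the URL construction testable without network access; a real client would presumably replace `lookup_owner_id` with a call to the resolver API.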

AcckiyGerman commented 6 years ago

Just tested with the latest versions of data-cli, data.js, and datahub-client. Test failed:

user@pc:~/work/datahq/data-cli$ node bin/data.js get https://datahub.io/AcckiyGerman/datahub-qa-issues-tracker
> Error! invalid json response body at https://pkgstore.datahub.io/a08d3588fbae0355042537595c65819d/datahub-qa-issues-tracker/latest/datapackage.json reason: Unexpected token < in JSON at position 0

How should I test it?

@Mikanebu Please provide some description when moving issues to 'Review', e.g. "Fixed in this PR; to test the changes, do such and such (e.g. fetch this branch)". Otherwise I can't test it properly, because I don't see any PR in this issue and don't know where the fix is.

Update: OK, I found it in the data.js repo, but the tests still fail.

Mikanebu commented 6 years ago

@AcckiyGerman Good catch, first of all! 👍 Since we are now using revisionId instead of latest for generating the path, I fixed the parseDatasetIdentifier function in the data.js library, which generates the path. FIXED in PR https://github.com/datahq/data.js/pull/30. Please test and close it.

AcckiyGerman commented 6 years ago

FAILED:

user@pc:~/work/datahq/data-cli$ node bin/data.js get http://datahub.io/AcckiyGerman/datahub-qa-issues-tracker
> Error! invalid json response body at http://pkgstore.datahub.io/a08d3588fbae0355042537595c65819d/datahub-qa-issues-tracker/latest/datapackage.json reason: Unexpected token < in JSON at position 0
user@pc:~/work/datahq/data-cli$ node bin/data.js get http://datahub.io/JohnSnowLabs/gdp-by-industry-and-country
> Error! invalid json response body at http://pkgstore.datahub.io/JohnSnowLabs/gdp-by-industry-and-country/latest/datapackage.json reason: Unexpected token < in JSON at position 0
user@pc:~/work/datahq/data-cli$ data get http://datahub.io/JohnSnowLabs/gdp-by-industry-and-country
> Error! invalid json response body at http://pkgstore.datahub.io/JohnSnowLabs/gdp-by-industry-and-country/latest/datapackage.json reason: Unexpected token < in JSON at position 0
user@pc:~/work/datahq/data-cli$ data get http://datahub.io/AcckiyGerman/datahub-qa-issues-tracker
> Error! invalid json response body at http://pkgstore.datahub.io/a08d3588fbae0355042537595c65819d/datahub-qa-issues-tracker/latest/datapackage.json reason: Unexpected token < in JSON at position 0
user@pc:~/work/datahq/data-cli$ data -v
0.7.4

As you can see, I tried both the GitHub version and the latest binary, but `data get` is not working for non-core datasets :(
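The recurring "Unexpected token < in JSON at position 0" message is what a JSON parser reports when the body starts with `<`, that is, when the server answered with HTML or XML (such as the `NoSuchKey` error shown earlier) instead of a datapackage.json. A minimal sketch of a clearer check, assuming a hypothetical helper `parse_datapackage` that is not part of data-cli:

```python
import json


def parse_datapackage(body: str, url: str) -> dict:
    """Parse a datapackage.json body, flagging markup responses explicitly."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as err:
        # A leading '<' suggests an HTML page or an XML error document.
        if body.lstrip().startswith("<"):
            raise ValueError(
                f"{url} returned markup (HTML or XML), not JSON; "
                "the path may not exist in the package store"
            ) from err
        raise
```

With a check like this, a wrong owner id or revision in the URL would show up as a missing-path error rather than as a JSON syntax error.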

anuveyatsu commented 6 years ago

@AcckiyGerman I suppose you're not using the latest changes from the data.js library, which should fix this. This change is not included in v0.7.4.

AcckiyGerman commented 6 years ago

TESTED & FIXED

zelima commented 6 years ago

@AcckiyGerman tip: actual URLs would be more useful here, instead of saying tested on X dataset

AcckiyGerman commented 6 years ago

@Mikanebu this bug again :cry: While I can see this dataset on datahub.io, I cannot get it: http://datahub.io/examples/vega-views-tutorial-topojson

$ data get http://datahub.io/examples/vega-views-tutorial-topojson
> Error! invalid json response body at http://pkgstore.datahub.io/examples/vega-views-tutorial-topojson/1/datapackage.json reason: Unexpected token < in JSON at position 0
$ data -v
0.7.9

Tested with the binary from datahub.io/download

Mikanebu commented 6 years ago

@AcckiyGerman Republished all the examples datasets. It had been 3 months since the last update. The problem came from the upgraded backend, so we need to re-push datasets if this error appears.

AcckiyGerman commented 6 years ago

OK, I will test and close as duplicate.
