datopian / datahub-qa

:package: Bugs, issues and suggestions for datahub.io
https://datahub.io/
32 stars 6 forks source link

Print proper error messages for "info" and "get" commands when URL is invalid #132

Closed anuveyatsu closed 6 years ago

anuveyatsu commented 6 years ago

If I try to "info" or "get" some invalid URL, I'd get following error message that doesn't tell me enough about what's wrong:

> Error! invalid json response body at https://api.datahub.io/source/null/undefined/successful reason: Unexpected token < in JSON at position 0

How to reproduce

Expected behavior

Mikanebu commented 6 years ago

@AcckiyGerman It is fixed in https://github.com/datahq/data-cli/issues/303. Please review

AcckiyGerman commented 6 years ago

TESTED - FAIL

invalid domain

Invalid url on the datahub

invalid url on the github

Error message

> Error! NOT FOUND - is not telling what is not found. Let's change to:

refactor

@zelima @akariv As I see we have remote connections code & error handlers spread all over the data-cli code. That leads to non-consistent error messages and a lot of try/catch everywhere.

WDYT if we refactor code to handle all the connections, checking URL and responce codes, related errors in one place? Something like Connection lib ? I know the part of this is a datahub-client lib, but as you see, we also should handle errors and mistakes related to github and other sites.

anuveyatsu commented 6 years ago

@AcckiyGerman to me > Error! NOT FOUND is sufficient error and it's coming from data.js library when it tries to read the descriptor from given URL.

AcckiyGerman commented 6 years ago

@anuveyatsu then we could fix it in the data.js.

User who get Error: NOT FOUND will think: What is actually not found, ha?:

Oh, this programm is too hard for me...

Mikanebu commented 6 years ago

@AcckiyGerman Are you suggesting to replace to > Error! Provided URL is invalid. Expected URL to a dataset or descriptor ?

akariv commented 6 years ago

@AcckiyGerman some centralizing is a good idea for sure. But the meaning of a 404 error is different based on context, so it should be handled in the caller and no the callee I think.

AcckiyGerman commented 6 years ago

@Mikanebu yes, change NOT FOUND to Provided URL is invalid. Expected URL to a dataset or descriptor in the data.js please. And this also need to be fixed:

$ node bin/data.js info http://no-site.io
File descriptor:
{
  "path": "http://no-site.io",
  "pathType": "remote",
  "name": "",
  "format": "",
  "encoding": "utf-8"
}
anuveyatsu commented 6 years ago

@AcckiyGerman @Mikanebu I think it depends on context, e.g., if it's happening in Dataset class or inparseDatasetIndentifier function then yes, otherwise in File class it should say that provided URL is invalid.

Re info command, I don't think we should do anything there. It acts just as parser there.

AcckiyGerman commented 6 years ago

@anuveyatsu we could just check the response code, if it is not 200-203 then return error instead of parsing the response.

anuveyatsu commented 6 years ago

@Mikanebu ok, can you please fix that and let's close the issue

Mikanebu commented 6 years ago

@AcckiyGerman FIXED, please test again. Data get and info for URL throws the following error messages

Meirans-MBP:data-cli Zhiyenbayev_mirza$  data info https://datahub.io
> Error! Provided URL is invalid. Expected URL to a dataset or descriptor.
Meirans-MBP:data-cli Zhiyenbayev_mirza$ data info http://NON-EXISTING-DOMAIN.in/
> Error! Connection error: getaddrinfo ENOTFOUND non-existing-domain.in non-existing-domain.in:80
AcckiyGerman commented 6 years ago

TESTED & FIXED User now get proper error, when provide invalid url to data get|info

AcckiyGerman commented 6 years ago

TEST ON GITHUB is not very good still:

Valid link - TEST is OK

user@Dima-PC:~$ data info https://raw.githubusercontent.com/frictionlessdata/test-data/master/LICENSE
File descriptor:
{
  "path": "https://raw.githubusercontent.com/frictionlessdata/test-data/master/LICENSE",
  "pathType": "remote",
  "name": "LICENSE",
  "format": "",
  "encoding": "utf-8"
}

Invalid link (return 404 status code, but also return a html page) - TEST FAIL

user@Dima-PC:~$ data info https://raw.githubusercontent.com/frictionlessdata/test-data/master/LICENSE_
File descriptor:
{
  "path": "https://raw.githubusercontent.com/frictionlessdata/test-data/master/LICENSE_",
  "pathType": "remote",
  "name": "LICENSE_",
  "format": "",
  "encoding": "utf-8"
}

But it is a very specific case - leaving issue closed