Closed anuveyatsu closed 6 years ago
@AcckiyGerman It is fixed in https://github.com/datahq/data-cli/issues/303. Please review
TESTED - FAIL
$ node bin/data.js get http://no-site.io
Error! Connection error: request to http://no-site.io failed, reason: getaddrinfo ENOTFOUND no-site.io no-site.io:80$ node bin/data.js info http://no-site.io
File descriptor:
{
"path": "http://no-site.io",
"pathType": "remote",
"name": "",
"format": "",
"encoding": "utf-8"
}$ node bin/data.js get http://datahub.io/core/not-exist
> Error! NOT FOUND$ node bin/data.js info http://datahub.io
> Error! NOT FOUND[x] $ node bin/data.js info http://github.com/datasets/not-exist
Error! Not Found: https://raw.githubusercontent.com/datasets/not-exist/master/datapackage.json
[x] $ node bin/data.js get http://github.com/datasets/not-exist
Error! Not Found: https://raw.githubusercontent.com/datasets/not-exist/master/datapackage.json
> Error! NOT FOUND
- is not telling what is not found. Let's change to:
> Error! No dataset found on the given URL
> Provided URL is invalid. Expected URL to a dataset or descriptor.
as it written in the Expected behavior@zelima @akariv As I see we have remote connections code & error handlers spread all over the data-cli
code. That leads to non-consistent error messages and a lot of try/catch everywhere.
WDYT if we refactor code to handle all the connections, checking URL and responce codes, related errors in one place? Something like Connection lib ?
I know the part of this is a datahub-client
lib, but as you see, we also should handle errors and mistakes related to github and other sites.
@AcckiyGerman to me > Error! NOT FOUND
is sufficient error and it's coming from data.js
library when it tries to read the descriptor from given URL.
@anuveyatsu then we could fix it in the data.js
.
User who get Error: NOT FOUND
will think:
What is actually not found, ha?:
Oh, this programm is too hard for me...
@AcckiyGerman Are you suggesting to replace to > Error! Provided URL is invalid. Expected URL to a dataset or descriptor
?
@AcckiyGerman some centralizing is a good idea for sure. But the meaning of a 404 error is different based on context, so it should be handled in the caller and no the callee I think.
@Mikanebu yes, change NOT FOUND
to Provided URL is invalid. Expected URL to a dataset or descriptor
in the data.js
please.
And this also need to be fixed:
$ node bin/data.js info http://no-site.io
File descriptor:
{
"path": "http://no-site.io",
"pathType": "remote",
"name": "",
"format": "",
"encoding": "utf-8"
}
@AcckiyGerman @Mikanebu I think it depends on context, e.g., if it's happening in Dataset
class or inparseDatasetIndentifier
function then yes, otherwise in File
class it should say that provided URL is invalid.
Re info command, I don't think we should do anything there. It acts just as parser there.
@anuveyatsu we could just check the response code, if it is not 200-203
then return error instead of parsing the response.
@Mikanebu ok, can you please fix that and let's close the issue
@AcckiyGerman FIXED, please test again. Data get and info for URL throws the following error messages
Meirans-MBP:data-cli Zhiyenbayev_mirza$ data info https://datahub.io
> Error! Provided URL is invalid. Expected URL to a dataset or descriptor.
Meirans-MBP:data-cli Zhiyenbayev_mirza$ data info http://NON-EXISTING-DOMAIN.in/
> Error! Connection error: getaddrinfo ENOTFOUND non-existing-domain.in non-existing-domain.in:80
TESTED & FIXED
User now get proper error, when provide invalid url to data get|info
Valid link - TEST is OK
user@Dima-PC:~$ data info https://raw.githubusercontent.com/frictionlessdata/test-data/master/LICENSE
File descriptor:
{
"path": "https://raw.githubusercontent.com/frictionlessdata/test-data/master/LICENSE",
"pathType": "remote",
"name": "LICENSE",
"format": "",
"encoding": "utf-8"
}
Invalid link (return 404 status code, but also return a html page) - TEST FAIL
user@Dima-PC:~$ data info https://raw.githubusercontent.com/frictionlessdata/test-data/master/LICENSE_
File descriptor:
{
"path": "https://raw.githubusercontent.com/frictionlessdata/test-data/master/LICENSE_",
"pathType": "remote",
"name": "LICENSE_",
"format": "",
"encoding": "utf-8"
}
But it is a very specific case - leaving issue closed
If I try to "info" or "get" some invalid URL, I'd get following error message that doesn't tell me enough about what's wrong:
How to reproduce
data info https://datahub.io
data get https://datahub.io
Expected behavior
Provided URL is invalid. Expected URL to a dataset or descriptor.
data info http://NON-EXISTING-DOMAIN.in/
data info http://datahub.io/core/non-existing-dataset
domain is real, but web server returns non-success status code (all except200-203
)