r-lib / httr

httr: a friendly http package for R
https://httr.r-lib.org

httr::parse_url not parsing AWS s3 uri correctly #659

Closed DyfanJones closed 1 year ago

DyfanJones commented 4 years ago

Hi all,

When working with AWS S3 URIs, I have noticed that httr::parse_url doesn't parse them as expected. A standard S3 URI has the format s3://mybucket/path/to/file. When parsed, I expect it to be:

# $scheme
# s3
# 
# $hostname
# mybucket
# 
# $port
# NULL
#
# $path
# path/to/file
# 

However, I get the following:

s3_uri = "s3://mybucket/path/to/file"

(parsed1 = httr::parse_url(s3_uri))
# $scheme
# NULL
# 
# $hostname
# NULL
# 
# $port
# NULL
# 
# $path
# [1] "s3://mybucket/path/to/file"
# 
# $query
# NULL
# 
# $params
# NULL
# 
# $fragment
# NULL
# 
# $username
# NULL
# 
# $password
# NULL
# 
# attr(,"class")
# [1] "url"

urltools::url_parse, by contrast, parses the URI as expected:

parsed2 = urltools::url_parse(s3_uri)
# scheme  domain    port  path          parameter  fragment
# s3      mybucket  <NA>  path/to/file  <NA>       <NA>

Python's urllib.parse gives a similar result:

import urllib.parse

s3_uri = "s3://mybucket/path/to/file"

urllib.parse.urlparse(s3_uri)
# ParseResult(scheme='s3', netloc='mybucket', path='/path/to/file', params='', query='', fragment='')
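
As a stopgap, a minimal sketch in base R could split the URI with a regular expression. RFC 3986 allows digits in a scheme after the first letter, so the pattern below accepts "s3". Note that parse_s3_uri is a hypothetical helper written for this illustration, not part of httr or urltools, and it only handles the simple scheme://host/path shape (no port, query, or credentials).

```r
# Hypothetical helper (not part of httr): split a URI of the form
# scheme://host/path with a regular expression. The scheme pattern
# follows RFC 3986: a letter, then letters, digits, "+", "." or "-".
parse_s3_uri <- function(uri) {
  pattern <- "^([A-Za-z][A-Za-z0-9+.-]*)://([^/]*)/?(.*)$"
  # regmatches() returns the full match followed by the capture groups,
  # or character(0) when the pattern does not match.
  m <- regmatches(uri, regexec(pattern, uri))[[1]]
  if (length(m) == 0) {
    stop("Not a URI of the form scheme://host/path: ", uri)
  }
  list(scheme = m[2], hostname = m[3], path = m[4])
}

res <- parse_s3_uri("s3://mybucket/path/to/file")
res$scheme    # "s3"
res$hostname  # "mybucket"
res$path      # "path/to/file"
```

This is not a general URL parser; for anything beyond bucket and key, a dedicated parser such as urltools::url_parse is likely the safer choice.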

hadley commented 1 year ago

httr has been superseded in favour of httr2, so is no longer under active development. If this problem is still important to you in httr2, I'd suggest filing an issue over there 😄. Thanks for using httr!