libwww-perl / URI

The Perl URI module
https://metacpan.org/pod/URI

NAME

URI - Uniform Resource Identifiers (absolute and relative)

SYNOPSIS

use URI ();

$u1 = URI->new("http://www.example.com");
$u2 = URI->new("foo", "http");
$u3 = $u2->abs($u1);
$u4 = $u3->clone;
$u5 = URI->new("HTTP://WWW.example.com:80")->canonical;

$str = $u->as_string;
$str = "$u";

$scheme = $u->scheme;
$opaque = $u->opaque;
$path   = $u->path;
$frag   = $u->fragment;

$u->scheme("ftp");
$u->host("ftp.example.com");
$u->path("cpan/");

DESCRIPTION

This module implements the URI class. Objects of this class represent "Uniform Resource Identifier references" as specified in RFC 2396 (and updated by RFC 2732).

A Uniform Resource Identifier is a compact string of characters that identifies an abstract or physical resource. A Uniform Resource Identifier can be further classified as either a Uniform Resource Locator (URL) or a Uniform Resource Name (URN). The distinction between URL and URN does not matter to the URI class interface. A "URI-reference" is a URI that may have additional information attached in the form of a fragment identifier.

An absolute URI reference consists of three parts: a scheme, a scheme-specific part and a fragment identifier. A subset of URI references share a common syntax for hierarchical namespaces. For these, the scheme-specific part is further broken down into authority, path and query components. These URIs can also take the form of relative URI references, where the scheme (and usually also the authority) component is missing, but implied by the context of the URI reference. The three forms of URI reference syntax are summarized as follows:

<scheme>:<scheme-specific-part>#<fragment>
<scheme>://<authority><path>?<query>#<fragment>
<path>?<query>#<fragment>

The components into which a URI reference can be divided depend on the scheme. The URI class provides methods to get and set the individual components. The methods available for a specific URI object depend on the scheme.
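As a rough illustration of the three forms above, the following sketch splits one URI of each kind into its components. It assumes the URI distribution is installed; the example.com addresses and paths are made up for the example.

```perl
use strict;
use warnings;
use URI ();

# Hierarchical form: <scheme>://<authority><path>?<query>#<fragment>
my $http = URI->new("http://www.example.com/a/b?x=1#top");
my @parts = ($http->scheme, $http->authority, $http->path,
             $http->query, $http->fragment);
print join(" | ", @parts), "\n";

# Opaque form: <scheme>:<scheme-specific-part>
my $mail = URI->new('mailto:user@example.com');
print $mail->scheme, " | ", $mail->opaque, "\n";

# Relative form: the scheme is missing, so scheme() returns undef.
my $rel = URI->new("c/d?y=2#sec");
my $has_scheme = defined($rel->scheme) ? "has scheme" : "no scheme";
print "$has_scheme | ", $rel->path, "\n";

# Which methods exist depends on the scheme of the object.
my $http_host = $http->can("host") ? "yes" : "no";
my $mail_host = $mail->can("host") ? "yes" : "no";
print "http has host(): $http_host, mailto has host(): $mail_host\n";
```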

CONSTRUCTORS

The following methods construct new URI objects:
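A minimal sketch of three of the constructors, URI->new, URI->new_abs and clone, assuming the URI distribution is installed. The URLs are placeholders chosen for the example.

```perl
use strict;
use warnings;
use URI ();

# Absolute URI: the scheme is taken from the string itself.
my $abs = URI->new("http://www.example.com/docs/index.html");

# Relative reference: the second argument supplies the scheme that
# selects the implementation class; the string is left unchanged.
my $rel = URI->new("img/logo.png", "http");

# new_abs() builds the reference and resolves it against a base.
my $joined = URI->new_abs("img/logo.png", $abs);

# clone() returns an independent copy; changing it leaves $abs alone.
my $copy = $abs->clone;
$copy->path("/other.html");

print "$_\n" for $abs, $rel, $joined, $copy;
```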

COMMON METHODS

The methods described in this section are available for all URI objects.

Methods that give access to components of a URI always return the old value of the component. The value returned is undef if the component was not present. There is generally a difference between a component that is empty (represented as "") and a component that is missing (represented as undef). If an accessor method is given an argument, it updates the corresponding component in addition to returning the old value of the component. Passing an undefined argument removes the component (if possible). The description of each accessor method indicates whether the component is passed as an escaped (percent-encoded) or an unescaped string. A component that can be further divided into sub-parts is usually passed escaped, as unescaping might change its semantics.
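The accessor conventions can be sketched as follows, using the query and fragment accessors on a made-up URL. This is an illustration only and assumes the URI distribution is installed.

```perl
use strict;
use warnings;
use URI ();

# The trailing "?" makes the query present but empty; there is no fragment.
my $u = URI->new("http://www.example.com/doc?");
my $query    = $u->query;      # "" (present, empty)
my $fragment = $u->fragment;   # undef (missing)

# A setter call returns the OLD value and installs the new one.
my $old1 = $u->fragment("intro");     # undef, there was no fragment
my $old2 = $u->fragment("summary");   # "intro"

# Passing undef removes the component altogether.
$u->query(undef);
print "$u\n";                         # http://www.example.com/doc#summary
```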

The common methods available for all URI objects are:
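A short sketch of a few common methods (canonical, eq, abs, rel and as_string), assuming the URI distribution is installed. The URLs are invented for the example and the selection of methods is not exhaustive.

```perl
use strict;
use warnings;
use URI ();

# canonical() lowercases the scheme and host and drops the default port.
my $canon = URI->new("HTTP://WWW.example.com:80")->canonical;

# eq() compares the canonical forms of two URIs.
my $same = URI->new("http://www.EXAMPLE.com/")->eq($canon);

# abs() resolves a relative reference against a base URI;
# rel() goes the other way.
my $base     = URI->new("http://www.example.com/a/b/c");
my $absolute = URI->new("../d")->abs($base);
my $relative = $absolute->rel($base);

print $canon->as_string, "\n";
print "equal\n" if $same;
print "$absolute\n$relative\n";
```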

GENERIC METHODS

The following methods are available to schemes that use the common/generic syntax for hierarchical namespaces. The descriptions of schemes below indicate which these are. Unrecognized schemes are assumed to support the generic syntax, and therefore the following methods:
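To make the generic syntax concrete, here is a hedged sketch using authority, path, path_query, path_segments and query_form. It assumes the URI distribution is installed; the URLs and the "myscheme" scheme are invented for the example.

```perl
use strict;
use warnings;
use URI ();

my $u = URI->new("http://www.example.com:8080/a/b%20c/d?x=1&y=two#frag");

my $authority  = $u->authority;       # "www.example.com:8080"
my $path       = $u->path;            # "/a/b%20c/d" (escaped)
my $path_query = $u->path_query;      # "/a/b%20c/d?x=1&y=two"

# path_segments() returns unescaped segments; the leading "" stands
# for the initial "/" of an absolute path.
my @segments = $u->path_segments;     # ("", "a", "b c", "d")

# query_form() reads or writes the query as key/value pairs.
my %form = $u->query_form;            # (x => 1, y => "two")
$u->query_form(x => 1, q => "a b");
my $new_query = $u->query;            # "x=1&q=a+b"

# A scheme this module does not know is treated as generic.
my $foreign = URI->new("myscheme://host/some/path");
print $foreign->path, "\n";           # /some/path
```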

SERVER METHODS

For schemes where the authority component denotes an Internet host, the following methods are available in addition to the generic methods.
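The server methods can be sketched on an http URL as below. This assumes the URI distribution is installed; the user name, password and host are placeholders.

```perl
use strict;
use warnings;
use URI ();

my $u = URI->new('http://user:secret@www.example.com:8080/index.html');

my $userinfo  = $u->userinfo;       # "user:secret"
my $host      = $u->host;           # "www.example.com"
my $port      = $u->port;           # 8080
my $host_port = $u->host_port;      # "www.example.com:8080"
my $default   = $u->default_port;   # 80 for http

# Removing the explicit port makes port() fall back to the default.
$u->port(undef);
print "$u\n";                       # http://user:secret@www.example.com/index.html
print $u->port, "\n";               # 80
```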

SCHEME-SPECIFIC SUPPORT

Scheme-specific support is provided for the following URI schemes. For URI objects that do not belong to one of these, you can only use the common and generic methods.
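As one hedged example of scheme-specific support, mailto URIs provide the to and headers methods, which other schemes do not have. The sketch assumes the URI distribution is installed; the address and subject are made up.

```perl
use strict;
use warnings;
use URI ();

my $mail = URI->new('mailto:user@example.com?subject=Hello');

# to() returns the recipient; headers() returns key/value pairs
# that include the "to" entry.
my $to      = $mail->to;
my %headers = $mail->headers;

print "To: $to\n";
print "Subject: $headers{subject}\n";

# An http URI has no to() method, since the method set follows the scheme.
my $http = URI->new("http://www.example.com/");
print "http has to(): ", ($http->can("to") ? "yes" : "no"), "\n";
```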

CONFIGURATION VARIABLES

The following configuration variables influence how the class and its methods behave:
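One such variable is $URI::DEFAULT_QUERY_FORM_DELIMITER, which selects the separator that query_form uses when it writes key/value pairs. The sketch below assumes the URI distribution is installed and uses a fresh object for each call, so that no existing query string influences the choice of delimiter; the URL is a placeholder.

```perl
use strict;
use warnings;
use URI ();

# Default behaviour: pairs are joined with "&".
my $amp = URI->new("http://www.example.com/search");
$amp->query_form(q => "perl", page => 2);

# With the variable set to ";", pairs are joined with ";" instead.
# local() restores the previous value when the block ends.
my $semi = URI->new("http://www.example.com/search");
{
    local $URI::DEFAULT_QUERY_FORM_DELIMITER = ";";
    $semi->query_form(q => "perl", page => 2);
}

print $amp->query, "\n";
print $semi->query, "\n";
```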

ENVIRONMENT VARIABLES

BUGS

There are some things that are not quite right:

PARSING URIs WITH REGEXP

As an alternative to this module, the following regular expression, taken from Appendix B of RFC 2396, can be used to split a URI reference into its components:

my($scheme, $authority, $path, $query, $fragment) =
    $uri =~ m|(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?|;

The URI::Split module provides the function uri_split() as a readable alternative.
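A brief sketch of uri_split, together with its counterpart uri_join from the same module, assuming the URI distribution is installed. The URL is invented for the example.

```perl
use strict;
use warnings;
use URI::Split qw(uri_split uri_join);

my $uri = "http://www.example.com/a/b?x=1#top";

# uri_split() returns the five components; a missing one is undef.
my ($scheme, $authority, $path, $query, $fragment) = uri_split($uri);
print "$scheme | $authority | $path | $query | $fragment\n";

# uri_join() puts the components back together.
my $rebuilt = uri_join($scheme, $authority, $path, $query, $fragment);
print "$rebuilt\n";

# For a relative reference, scheme and authority come back undefined.
my ($rel_scheme, $rel_authority, $rel_path) = uri_split("c/d");
```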

SEE ALSO

URI::file, URI::WithBase, URI::Escape, URI::Split, URI::Heuristic

RFC 2396: "Uniform Resource Identifiers (URI): Generic Syntax", Berners-Lee, Fielding, Masinter, August 1998.

http://www.iana.org/assignments/uri-schemes

http://www.iana.org/assignments/urn-namespaces

http://www.w3.org/Addressing/

COPYRIGHT

Copyright 1995-2009 Gisle Aas.

Copyright 1995 Martijn Koster.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

AUTHORS / ACKNOWLEDGMENTS

This module is based on the URI::URL module, which in turn was (distantly) based on the wwwurl.pl code in the libwww-perl for perl4 developed by Roy Fielding, as part of the Arcadia project at the University of California, Irvine, with contributions from Brooks Cutter.

URI::URL was developed by Gisle Aas, Tim Bunce, Roy Fielding and Martijn Koster with input from other people on the libwww-perl mailing list.

URI and related subclasses were developed by Gisle Aas.