Open VladimirAlexiev opened 4 years ago
Yes, that's right!
I have worried a bit about it too, but the way I've approached it is through coercions in the type system, which has recently entered Attean. There's the `to_AtteanIRI` function in `Types::Attean`. I've been using that for this purpose. I do agree, though, that it would be more elegant if it happened automatically, but I haven't had time to explore this further.
I'm not sure the code in URI/NamespaceMap.pm#L201 is relevant. IIRC, that is about turning an argument into a string, and in that case `Attean::IRI` should be supported, because it isa `IRI`.
I'd love to see your use case `$model->subjects ($map->rdf->type, $map->owl->Ontology)` supported, because I do that all the time myself. I'm not sure how, but I tend to think it should be supported through the typing system and its coercions somehow. I would be happy to hear suggestions. I am certainly open to changing `URI::NamespaceMap` too, but I feel it shouldn't have a runtime requirement on Attean or Trine, etc.
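A minimal sketch of the coercion approach described above, assuming `Types::Attean` exports `to_AtteanIRI` and that its coercion accepts the `URI` objects returned by the namespace map (both as stated in this thread, not verified here). The prefixes and variable names are chosen for illustration.

```perl
use strict;
use warnings;
use URI::NamespaceMap;
use Types::Attean qw( to_AtteanIRI );

my $map = URI::NamespaceMap->new({
    rdf => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    owl => 'http://www.w3.org/2002/07/owl#',
});

# $map->rdf->type returns a URI object; coerce it explicitly
# before handing it to Attean.
my $type     = to_AtteanIRI($map->rdf->type);
my $ontology = to_AtteanIRI($map->owl->Ontology);

print $type->value, "\n";
print $ontology->value, "\n";
```

With an Attean model in `$model` (not constructed here), the use case would then read `$model->subjects($type, $ontology)`, at the cost of one explicit coercion per term.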
* is `IRI` compatible with `URI`?
No, and there's where the actual disconnect is, and where I think we need to see if we can fix things.
* another case is: after getting an iterator of `Attean::IRI` from a model, how to apply e.g. `$map->abbreviate()`?
You should be able to use `$map->abbreviate(to_Namespace($attean_iri))`, I think. I don't have tests for exactly that case, but I think it should work.
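Since the `to_Namespace` route is described as untested, a more conservative sketch is to pass the plain string to `abbreviate()`. This assumes only that `Attean::IRI` exposes the IRI string through `->value`. It is an alternative for illustration, not the approach recommended in the reply.

```perl
use strict;
use warnings;
use Attean;
use URI::NamespaceMap;

my $map = URI::NamespaceMap->new({
    rdf => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
});

# Stand-in for an IRI obtained from a model iterator.
my $attean_iri = Attean::IRI->new(value => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type');

# ->value is the IRI as a plain string, which abbreviate() accepts.
my $pname = $map->abbreviate($attean_iri->value);
print "$pname\n";
```

This sidesteps the question of which object classes `abbreviate()` recognises, since it only ever sees a string.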
I use conversions like this:
    sub iri ($) {
        # convert string or URI (returned by URI::NamespaceMap $MAP) to Attean::IRI
        my $uri = shift;
        Attean::IRI->new (value => ref($uri) ? $uri->as_string : $uri, lazy => 1)
    }
    sub uri ($) {
        # convert string or IRI to URI
        my $iri = shift;
        URI->new (ref($iri) ? $iri->as_string : $iri);
    }
@kjetilk said in #152:
You shouldn't need to define your own `iri` and `IRI` functions. There's now an `AtteanIRI` type in `Types::Attean`. Conventionally, these types have a function to convert by prepending with `to_`. So you should be able to do just: `use Types::Attean qw( to_AtteanIRI );` and then you should be able to use the `to_AtteanIRI` function for both these conversions and many more.
But:
- Does `to_AtteanIRI` take either string or `can('as_string')` argument?
- Note that I use `lazy => 1` to speed up the conversion, because I don't need the IRI parsed into components. Can `to_AtteanIRI` do this?
- Do I still need to define my function `uri()` to convert IRI->URI?
- Is there something like `lazy => 1` in `URI->new`? Couldn't find any.
Does to_AtteanIRI take either string or can('as_string') argument?
Yes, it can convert from a string too (not sure I understood the last part of the question).
Note that I use lazy => 1 to speed up the conversion, because I don't need the IRI parsed into components. Can to_AtteanIRI do this?
No, but I consider that the natural next step: https://github.com/kasei/perl-iri/issues/14
I think the main tension is between the `IRI` class and `URI`, because `Attean::IRI` is a subclass of `IRI`, and so a solution to this problem is likely to be more appropriate in `IRI`. Perhaps @kasei can comment on that.
Note that `lazy => 1` was introduced as a response to a performance problem I had with URI parsing. There is a lot of it going on in certain applications, so I think it is worthwhile looking into this problem.
I did some experiments, and @kasei did actually merge them, so there is code in `IRI` that could be helpful:
https://github.com/kasei/perl-iri/pull/16
I ran out of time to explore this in the depth it needs. Not only do we need to pass the components back and forth, we also need to test it properly and benchmark it, but I haven't got the time.
Do I still need to define my function uri() to convert IRI->URI?
No, I don't think so.
Is there something like lazy => 1 in URI->new? Couldn't find any.
No, but I exploited that you can pass the components to `URI`, so my idea is that we can use the methods to set them at both ends. It isn't a given that this gives a performance boost, as it results in many subroutine calls, but it would be interesting to see if it does.
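A rough sketch of what setting components through methods could look like on the `URI` side. The `%parts` hash is hypothetical and merely stands in for components that an already-parsed IRI object could hand over. This is not the code from the perl-iri pull request, and it makes no claim about performance.

```perl
use strict;
use warnings;
use URI;

# Hypothetical, already-parsed components (illustration only).
my %parts = (
    scheme   => 'http',
    host     => 'example.org',
    path     => '/ns',
    fragment => 'type',
);

# Start from an empty URI and fill it in through the accessor
# methods instead of handing URI a complete string to parse.
my $uri = URI->new('', $parts{scheme});
$uri->scheme($parts{scheme});
$uri->host($parts{host});
$uri->path($parts{path});
$uri->fragment($parts{fragment});

print $uri->as_string, "\n";
```

Each setter is a separate subroutine call, which is exactly the overhead the comment above is unsure about, so a benchmark would be needed before drawing conclusions.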
@kjetilk :
Does to_AtteanIRI take can('as_string') argument?
I mean whether it can take a URI argument, and use ->as_string
my function uri() to convert IRI->URI? No, I don't think so.
If I use a stock `to_AtteanIRI()` but also need to convert IRI->URI (in order to put it in a NamespaceMap), I still need to define my own function for that?
you can pass the components to URI, so my idea is that we can use the methods to set them in both ends
I understand. But in some applications (eg semweb) you don't need to parse IRIs/URIs into components at all, so a `lazy` option saves that effort.
- Note that I use `lazy => 1` to speed up the conversion, because I don't need the IRI parsed into components. Can `to_AtteanIRI` do this?
Note that `lazy` only defers the parsing of components. It doesn't avoid it altogether.
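To illustrate what "defers" means here, a self-contained sketch of lazy component parsing. `LazyIRI` and `$PARSE_COUNT` are hypothetical names made up for this example, and this is not how `IRI` or `Attean::IRI` is implemented. The regex is the generic one from RFC 3986, Appendix B.

```perl
use strict;
use warnings;

# Hypothetical class for illustration only.
package LazyIRI;

our $PARSE_COUNT = 0;    # how many times a string was actually parsed

sub new {
    my ($class, $value) = @_;
    # Only the string is stored; nothing is parsed yet.
    return bless { value => $value }, $class;
}

sub value { $_[0]{value} }

# Parse on first use and cache the result for later accessors.
sub _components {
    my $self = shift;
    return $self->{components} //= do {
        $PARSE_COUNT++;
        my ($scheme, $authority, $path, $query, $fragment) = $self->{value} =~
            m{^(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?}s;
        +{ scheme => $scheme, authority => $authority, path => $path,
           query => $query, fragment => $fragment };
    };
}

sub scheme   { $_[0]->_components->{scheme} }
sub fragment { $_[0]->_components->{fragment} }

package main;

my $iri = LazyIRI->new('http://www.w3.org/2002/07/owl#Ontology');
print "parses after new:       $LazyIRI::PARSE_COUNT\n";

my $scheme = $iri->scheme;      # first accessor triggers the parse
my $frag   = $iri->fragment;    # served from the cached components
print "parses after accessors: $LazyIRI::PARSE_COUNT\n";
```

If no component accessor is ever called, the parse never runs. As soon as one is, the full cost is paid once, which is the distinction drawn above.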
@kjetilk :
Does to_AtteanIRI take can('as_string') argument?
I mean whether it can take a URI argument, and use `->as_string`
I'm still not sure I understand, could you provide a complete code example?
my function uri() to convert IRI->URI? No, I don't think so.
If I use a stock `to_AtteanIRI()` but also need to convert IRI->URI (in order to put it in a NamespaceMap), I still need to define my own function for that?
No, there's a coercion for that in `Types::Namespace`, so you should be able to use `to_Namespace` for that.
All these use `->as_string` for the conversion, so if that's where the performance problem is, it won't help, but I think that is where we should be solving the problem.
you can pass the components to URI, so my idea is that we can use the methods to set them in both ends
I understand. But in some applications (eg semweb) you don't need to parse IRIs/URIs into components at all, so a `lazy` option saves that effort.
It isn't always that unproblematic, as a pure string comparison might not be sufficient. RFC3986 has a section on comparison and often these issues jump out to bite us in semweb applications. Often, you need to parse and normalize the URI at some point in the process. It shouldn't be when you most need performance, but it requires an elaborate design at times.
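A small illustration of why a pure string comparison can fall short, using the `URI` module's `canonical` and `eq` methods. The example URIs are made up, and this covers only scheme and host case and the default port, not every normalization discussed in RFC 3986.

```perl
use strict;
use warnings;
use URI;

my $left  = URI->new('HTTP://Example.ORG:80/ns#type');
my $right = URI->new('http://example.org/ns#type');

# Plain string comparison treats these as different...
my $same_string = $left->as_string eq $right->as_string;

# ...while URI's eq method compares the canonical forms.
my $same_uri = $left->eq($right);

print 'string comparison: ', ($same_string ? 'equal' : 'different'), "\n";
print 'URI comparison:    ', ($same_uri    ? 'equal' : 'different'), "\n";
print 'canonical form:    ', $left->canonical, "\n";
```

Canonicalizing requires parsing, so the design question raised above is where in the pipeline to pay for it, not whether it can always be skipped.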
I guess `URI::NamespaceMap` is the preferred way to make URIs from pnames of well-known namespaces, or namespaces harvested from an ingested RDF file. At least `bin/attean_query` uses that class.
However, `URI::NamespaceMap` returns `URI`, which is not compatible with `Attean::IRI`. Eg if you try to use:
Attean returns an error like this:
In `URI::NamespaceMap::_scrub_uri()` https://metacpan.org/release/URI-NamespaceMap/source/lib/URI/NamespaceMap.pm#L201 I see some code for compatibility with Trine, but not with Attean.
@kasei or @kjetilk, could you take care of this?