kasei / attean

A Perl Semantic Web Framework

spacepad redefined at ShowStuff (AtteanX/Parser/JSONLD) #153

Closed: VladimirAlexiev closed 4 years ago

VladimirAlexiev commented 4 years ago
$ time perl ../owl2soml.pl -voc schema schema.jsonld > schema2.yaml
Subroutine spacepad redefined at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635.
        require Debug/ShowStuff.pm called at C:/Strawberry/perl/site/lib/JSONLD.pm line 57
        JSONLD::BEGIN() called at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635
        eval {...} called at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635
        require JSONLD.pm called at C:/Strawberry/perl/site/lib/AtteanX/Parser/JSONLD.pm line 143
        AtteanX::Parser::JSONLD::BEGIN() called at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635
        eval {...} called at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635
        require AtteanX/Parser/JSONLD.pm called at C:/Strawberry/perl/lib/Module/Load.pm line 77
        eval {...} called at C:/Strawberry/perl/lib/Module/Load.pm line 77
        Module::Load::_load("AtteanX::Parser::JSONLD") called at C:/Strawberry/perl/lib/Module/Load/Conditional.pm line 457
        eval {...} called at C:/Strawberry/perl/lib/Module/Load/Conditional.pm line 457
        Module::Load::Conditional::can_load("modules", HASH(0x62dbba8)) called at C:/Strawberry/perl/site/lib/Attean.pm line 188
        Attean::get_parser("Attean", "filename", "schema.jsonld") called at ../owl2soml.pl line 189
        main::load_ontologies("schema.jsonld") called at ../owl2soml.pl line 314

The two files are:

I have the latest versions:

cpanm AtteanX::Parser::JSONLD
Fetching http://www.cpan.org/authors/id/G/GW/GWILLIAMS/AtteanX-Parser-JSONLD-0.001.tar.gz ...
Fetching http://www.cpan.org/authors/id/I/IS/ISHIGAKI/JSON-4.02.tar.gz ... OK
Fetching http://www.cpan.org/authors/id/G/GW/GWILLIAMS/JSONLD-0.002.tar.gz ... OK
kasei commented 4 years ago

I've seen a couple of cases of this popping up in the CPAN Testers results, but haven't been able to figure out what's causing it. It seems to be entirely outside of the JSONLD code. Is the code succeeding (noisily), or is this failing to run to completion?

kasei commented 4 years ago

OK. I did a bit more digging. This is an issue with String::Util 1.25+ and Debug::ShowStuff. String::Util 1.25 introduced an exportable spacepad function, and Debug::ShowStuff imports everything from String::Util.
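
The clash described above can be reproduced in isolation. The following is a minimal sketch, not the actual String::Util or Debug::ShowStuff code: `Hypo::Util` and `Hypo::Debug` are hypothetical stand-ins, and the "import everything" step is simulated with a glob assignment in a BEGIN block. It shows that a package which imports a `spacepad` and then defines its own gets the same warning when it is compiled.

```perl
use strict;
use warnings;

# Compile a snippet of Perl source and return the warnings it raised.
sub warnings_from {
    my ($source) = @_;
    my @seen;
    local $SIG{__WARN__} = sub { push @seen, @_ };
    eval "$source; 1" or die $@;
    return @seen;
}

# Hypothetical stand-in for String::Util 1.25+, which gained a spacepad export.
package Hypo::Util {
    sub spacepad { my ($s, $w) = @_; return sprintf '%-*s', $w, $s }
}

# Hypothetical stand-in for Debug::ShowStuff: it pulls in the other
# package's spacepad, then defines a sub with the same name.
my @clash = warnings_from(<<'PERL');
package Hypo::Debug;
use warnings;
BEGIN { *Hypo::Debug::spacepad = \&Hypo::Util::spacepad }   # simulated "import all"
sub spacepad { my ($s, $w) = @_; return $s . ' ' x ($w - length $s) }
PERL

print @clash;   # the "Subroutine spacepad redefined" message, if any
```

The warning is raised while the stand-in package is being compiled, which is why it shows up as soon as the real module is loaded, regardless of what the calling script does afterwards.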

I think the solution is probably to remove the use of Debug::ShowStuff from JSONLD. Debug::ShowStuff is critical while I work toward complete coverage of the JSON-LD spec, but I can try to strip it from the releases that end up on CPAN.

kasei commented 4 years ago

Filed bug report.

VladimirAlexiev commented 4 years ago

> Is the code succeeding (noisily)

A very good question! I'll let it run to completion since it takes 5 min on Turtle.

time perl ../owl2soml.pl -voc schema schema.ttl    > schema1.yaml
real    4m9.203s
user    0m0.000s
sys     0m0.094s

This is the largest ontology I've tested my tool on (508k ttl, 730k rdf, 808k jsonld; results in 428k yaml) so it takes substantial time to process.

BTW should I post another issue for optimizing this run time? My code doesn't use SPARQL (for now), just Model -> subjects/properties/objects/holds and Iter -> next/elements. If so, what's the easiest way to profile?

I suspect significant time is spent converting between Attean::IRI and URI (the two are not compatible). I use lazy => 1 to defer IRI parsing, but URI has no such option:

sub iri ($) {
  # convert string or URI (returned by URI::NamespaceMap $MAP) to Attean::IRI
  my $uri = shift or return;
  Attean::IRI->new (value => ref($uri) ? $uri->as_string : $uri, lazy => 1)
}

sub uri ($) {
  my $iri = shift;
  URI->new (ref($iri) ? $iri->as_string : $iri);
}
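
A full profiler such as Devel::NYTProf gives per-line timings for the whole run. As a lighter first check of the suspicion above, a conversion routine can be wrapped so that its calls are counted and timed. This is only a sketch under assumptions: `timed`, `%stats`, and the stand-in converter are hypothetical names, and the stand-in does not call URI or Attean::IRI, so it only illustrates the wrapping technique.

```perl
use strict;
use warnings;
use Time::HiRes ();

my %stats;   # name => { calls => N, seconds => S }

# Wrap a code ref so that each call is counted and timed under $name.
sub timed {
    my ($name, $code) = @_;
    return sub {
        my $start  = Time::HiRes::time();
        my @result = $code->(@_);          # always list context; fine for a sketch
        $stats{$name}{calls}   += 1;
        $stats{$name}{seconds} += Time::HiRes::time() - $start;
        return wantarray ? @result : $result[0];
    };
}

# Stand-in for the real conversion; swap in the body of uri() or iri() to measure it.
my $convert = timed(convert => sub { my ($s) = @_; return "<$s>" });

$convert->("http://example.org/$_") for 1 .. 1000;
printf "%s: %d calls, %.6f s\n", $_, @{ $stats{$_} }{qw(calls seconds)}
    for sort keys %stats;
```

If the accumulated time for the conversions turns out to be a small share of the four-minute run, the cost lies elsewhere and a real profiler is the better next step.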
VladimirAlexiev commented 4 years ago

schema.jsonld fails because it's missing namespaces (posted a bug).

However, I got the same warning, then it proceeded to completion:

Subroutine spacepad redefined at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635.
        require Debug/ShowStuff.pm called at C:/Strawberry/perl/site/lib/JSONLD.pm line 57
        JSONLD::BEGIN() called at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635
        eval {...} called at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635
        require JSONLD.pm called at C:/Strawberry/perl/site/lib/AtteanX/Parser/JSONLD.pm line 143
        AtteanX::Parser::JSONLD::BEGIN() called at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635
        eval {...} called at C:/Strawberry/perl/site/lib/Debug/ShowStuff.pm line 1635
        require AtteanX/Parser/JSONLD.pm called at C:/Strawberry/perl/lib/Module/Load.pm line 77
        eval {...} called at C:/Strawberry/perl/lib/Module/Load.pm line 77
        Module::Load::_load("AtteanX::Parser::JSONLD") called at C:/Strawberry/perl/lib/Module/Load/Conditional.pm line 457
        eval {...} called at C:/Strawberry/perl/lib/Module/Load/Conditional.pm line 457
        Module::Load::Conditional::can_load("modules", HASH(0x65c7a58)) called at C:/Strawberry/perl/site/lib/Attean.pm line 188
        Attean::get_parser("Attean", "filename", "extraction.ttl") called at ../../owl2soml/owl2soml.pl line 190
        main::load_ontologies("extraction.ttl", "skos-fix.ttl", "skos.rdf") called at ../../owl2soml/owl2soml.pl line 315

@kasei why would it get to Module::Load::_load("AtteanX::Parser::JSONLD") when processing "extraction.ttl"? Does it try each parser in turn until one succeeds? In random order? I think it would be better to have a central registry that maps file extension and MIME type to parser class, and dispatch from there.
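
The registry idea in the question above could look roughly like the following. This is a hypothetical sketch, not how Attean's get_parser works: `%PARSER_FOR_EXT` and `parser_class_for` are made-up names, the table is illustrative, and the RDFXML class name is an assumption. It only shows the dispatch step, so that a caller could load a single parser class instead of probing several.

```perl
use strict;
use warnings;

# Illustrative table only: extension => parser class name.
my %PARSER_FOR_EXT = (
    ttl    => 'AtteanX::Parser::Turtle',
    jsonld => 'AtteanX::Parser::JSONLD',
    rdf    => 'AtteanX::Parser::RDFXML',   # assumed class name
);

# Return the parser class registered for the file's extension, or undef.
sub parser_class_for {
    my ($filename) = @_;
    my ($ext) = $filename =~ /\.([^.\/\\]+)\z/ or return;
    return $PARSER_FOR_EXT{ lc $ext };
}

for my $file (qw(extraction.ttl schema.jsonld skos.rdf notes)) {
    my $class = parser_class_for($file) // '(no parser registered)';
    print "$file => $class\n";
}
```

With a lookup like this, a caller would `require` only the class that was returned, so reading a Turtle file would never load the JSON-LD parser or its dependencies.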

VladimirAlexiev commented 4 years ago

I thought I could work around this with no warnings 'redefine', but no luck:

sub load_ontologies(@) {
  no warnings 'redefine'; # quash "Subroutine spacepad redefined"
  while (my $data = shift) {
    # my $base  = Attean::IRI->new('http://example.org/');
    open(my $fh, '<:encoding(UTF-8)', $data) or my_die "can't open $data: $!";
    my $pclass  = Attean->get_parser(filename => $data) // 'AtteanX::Parser::Turtle';
    my $parser  = $pclass->new(namespaces => $map); # base => $base
    my $iter    = $parser->parse_iter_from_io($fh);
    my $quads   = $iter->as_quads($graph);
    $model->add_iter($quads);
  }
}

Neither this way:

{  no warnings 'redefine'; # quash "Subroutine spacepad redefined"
  sub load_ontologies(@) {
    ...
  }
}

nor putting it at top level:

use warnings;
no warnings 'redefine'; # quash https://github.com/kasei/attean/issues/153 "Subroutine spacepad redefined"
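
None of these placements can work, because the warnings pragma is lexically scoped: `no warnings 'redefine'` affects only code compiled in the enclosing block or file, and the redefinition happens while Debug/ShowStuff.pm itself is being compiled. One possible workaround is to filter the message at run time around the call that triggers the load. The sketch below is an assumption-laden illustration: `without_warning` is a hypothetical helper, and the noisy load is simulated with a string eval rather than a real Attean call.

```perl
use strict;
use warnings;

# Run $code while dropping only warnings that match $pattern.
# Other warnings go to the previously installed handler, or to STDERR.
sub without_warning {
    my ($pattern, $code) = @_;
    my $previous = $SIG{__WARN__};
    local $SIG{__WARN__} = sub {
        my ($msg) = @_;
        return if $msg =~ $pattern;
        if (ref $previous eq 'CODE') { $previous->($msg) } else { warn $msg }
    };
    return $code->();
}

# Simulated noisy load: compiling this package redefines f and would warn.
my $result = without_warning(qr/^Subroutine \w+ redefined/, sub {
    eval q{ package Hypo::Noisy; use warnings; sub f { 1 } sub f { 2 } 1 } or die $@;
    return Hypo::Noisy::f();
});

print "result: $result\n";
```

In load_ontologies the wrapped call would be the `Attean->get_parser(filename => $data)` line, since the stack trace shows that is where the JSON-LD parser gets loaded.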
kasei commented 4 years ago

The spacepad warnings should be resolved in JSONLD v0.004. I'd be happy to see any issues you want to file about profiling or performance, though I can't guarantee I'll be able to resolve them :)