mthom / scryer-prolog

A modern Prolog implementation written mostly in Rust.
BSD 3-Clause "New" or "Revised" License

[Bug] load_html/3 from library(sgml) nonfunctional on macOS #2588

Open yewscion opened 1 day ago

yewscion commented 1 day ago

Description

While following https://www.metalevel.at/prolog/dcg, and especially this section of the accompanying video: https://youtu.be/CvLsVfq6cks?t=1453, I ran into what I think is a bug in the sgml library (or something related; I don't know Rust, and the failure seems to be in the Rust code). Specifically, calling the load_html/3 predicate from library(sgml) makes current scryer-prolog builds panic and terminate. Logs below.

Logs

In case they are useful.

My system

General System Info

$ macchina

                  ,MMMM.           Host        -  crodriguez@DY49DGQL3H-7399.local
                .MMMMMM            Machine     -  Mac14,2
                MMMMM,             Kernel      -  24.0.0
      .;MMMMM:' MMMMMMMMMM;.       OS          -  macOS 15.0.0 Sequoia
    MMMMMMMMMMMMNWMMMMMMMMMMM:     DE          -  Aqua
  .MMMMMMMMMMMMMMMMMMMMMMMMWM.     WM          -  Quartz Compositor
  MMMMMMMMMMMMMMMMMMMMMMMMM.       Packages    -  235 (Homebrew), 17 (cargo)
 ;MMMMMMMMMMMMMMMMMMMMMMMM:        Shell       -  bash
 :MMMMMMMMMMMMMMMMMMMMMMMM:        Terminal    -  alacritty
 .MMMMMMMMMMMMMMMMMMMMMMMMM.       Brightness  -  48%
  MMMMMMMMMMMMMMMMMMMMMMMMMMM.     Resolution  -  3420x2224@60fps (as 1710x1112)
   .MMMMMMMMMMMMMMMMMMMMMMMMMM.    Uptime      -  14d 7h 37m
     MMMMMMMMMMMMMMMMMMMMMMMM      CPU         -  Apple M2 (8)
      ;MMMMMMMMMMMMMMMMMMMM.       CPU Load    -  255%
        .MMMM,.    .MMMM,.         Memory      -  3.9 GB / 8.4 GB
                                   Battery     -  8% & Discharging
                                   Disk Space  -  189.9 GB / 245.1 GB

Rustup info

$ rustup show
Default host: aarch64-apple-darwin
rustup home:  /Users/crodriguez/.local/opt/rustup

stable-aarch64-apple-darwin (default)

Expected (Using binary from 0.9.3)

$ ./scryer-prolog --version
126d7bb-modified
$ ./scryer-prolog
?- use_module(library(sgml)).
   true.
?- load_html("<html><head><title>Hello!</title></head></html>", Es, []).
   Es = [element(html,[],[element(head,[],[element(title,[],["Hello!"])]),element(body,[],[])])].
?-

Last Release (0.9.4)

$ ./target/release/scryer-prolog --version
v0.9.4
$ ./target/release/scryer-prolog
?- use_module(library(sgml)).
   true.
?- load_html("<html><head><title>Hello!</title></head></html>", Es, []).
   Es = [element(html,[],[element(head,[],[element(title,[],["Hello!"])]),element(body,[],[])])].
?-

Current HEAD (a2443d)

$ scryer-prolog --version
v0.9.4-186-g0552530b
$ scryer-prolog
?- use_module(library(sgml)).
   true.
?- load_html("<html><head><title>Hello!</title></head></html>", Es, []).
thread 'main' panicked at src/machine/system_calls.rs:8213:41:
called `Option::unwrap()` on a `None` value
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
$ RUST_BACKTRACE=full scryer-prolog
?- use_module(library(sgml)).
   true.
?- load_html("<html><head><title>Hello!</title></head></html>", Es, []).
thread 'main' panicked at src/machine/system_calls.rs:8213:41:
called `Option::unwrap()` on a `None` value
stack backtrace:
   0:        0x104f194cc - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h243268f17d714c7f
   1:        0x104afe444 - core::fmt::write::hb3cfb8a30e72d7ff
   2:        0x104eee97c - std::io::Write::write_fmt::hfb2314975de9ecf1
   3:        0x104f1d150 - std::panicking::default_hook::{{closure}}::h14c7718ccf39d316
   4:        0x104f1cba0 - std::panicking::default_hook::hc62e60da3be2f352
   5:        0x104f1e8b8 - std::panicking::rust_panic_with_hook::h09e8a656f11e82b2
   6:        0x104f1db20 - std::panicking::begin_panic_handler::{{closure}}::h1230eb3cc91b241c
   7:        0x104f1dabc - std::sys::backtrace::__rust_end_short_backtrace::hc3491307aceda2c2
   8:        0x104f1dab0 - _rust_begin_unwind
   9:        0x104f84b00 - core::panicking::panic_fmt::ha4b80a05b9fff47a
  10:        0x104f84c24 - core::panicking::panic::h298549a7412a7069
  11:        0x104f84ec4 - core::option::unwrap_failed::hb7af631ec4f78cd6
  12:        0x104d3eb74 - scryer_prolog::machine::system_calls::<impl scryer_prolog::machine::Machine>::html_node_to_term::hec0bda0cb247344f
  13:        0x104cdc5b8 - scryer_prolog::machine::Machine::run_module_predicate::hb245d28231a73d8a
  14:        0x104eb21d4 - scryer_prolog::run_binary::hfc907de46c55c216
  15:        0x104af000c - std::sys::backtrace::__rust_begin_short_backtrace::hf04dc65f0f5177e5
  16:        0x104af02f0 - _main
[24.0] {18:36} <(HEAD detached at v0.9.4)> :SVSD-Faculty: crodriguez@DY49DGQL3H-7399:scryer-prolog/$

bakaq commented 1 day ago

I can reproduce this on Linux.

bakaq commented 1 day ago

Ok, bisecting points to 260c52adec6e942 as the commit that introduced this. It seems that properly migrating from select to scraper for HTML parsing will need a bit more work. This is also a good indication that we need better (or at least some) tests for the libraries: that commit breaks load_html/3 completely, even for basic cases like this one, which even a barebones test suite would have caught.
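For readers who, like the reporter, don't know Rust: the panic message in the logs above means that `.unwrap()` was called on an `Option` that turned out to be `None`. The following is only a minimal sketch of that failure mode and of one common way to avoid it. The `Node` and `Term` types and the `node_to_term` function are hypothetical stand-ins invented for illustration; they are not the actual code at src/machine/system_calls.rs:8213, which is not shown in this thread.

```rust
// Hypothetical, simplified stand-ins; NOT the real scryer-prolog types.
#[derive(Debug, PartialEq)]
enum Term {
    Text(String),
    Element { name: String, children: Vec<Term> },
}

// A toy HTML node: either text, or an element whose name may be missing.
enum Node {
    Text(String),
    Element { name: Option<String>, children: Vec<Node> },
}

// Converts a node to a term. A missing element name yields `None`
// instead of aborting the whole process.
fn node_to_term(node: &Node) -> Option<Term> {
    match node {
        Node::Text(s) => Some(Term::Text(s.clone())),
        Node::Element { name, children } => {
            // `name.clone().unwrap()` would panic here when `name` is `None`;
            // the `?` operator returns `None` to the caller instead.
            let name = name.clone()?;
            // Convert every child; a single failing child makes the result `None`.
            let children = children
                .iter()
                .map(node_to_term)
                .collect::<Option<Vec<Term>>>()?;
            Some(Term::Element { name, children })
        }
    }
}
```

In this sketch, a caller that receives `None` can report an error back to the Prolog side rather than terminating the toplevel. Whether that is the right fix for the scraper migration is for the maintainers to decide.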

Skgland commented 22 hours ago

I will try to have a look at this, though this will probably not happen before Friday.

yewscion commented 12 hours ago

No rush. I just didn't want to revert to 0.9.3 before reporting this. I'll use 0.9.3 to complete the exercises I am doing, then switch back to HEAD so that I can keep reporting any bugs I find.