gfngfn / SATySFi

A statically-typed, functional typesetting system
GNU Lesser General Public License v3.0

New API to fix footnote duplication problem #266

Open yasuo-ozu opened 3 years ago

yasuo-ozu commented 3 years ago

We sometimes need to evaluate the same inline-text or block-text more than once (for example, in the +xgenlisting command of satysfi-enumitem).

Evaluating an inline-text twice causes several problems. For example, the counter of \footnote is incremented twice: http://satysfi-playground.tech/permalink/71a660eee721e76a94ba063272874e37eafdff3970fa8ba95d6d923bc4efef32

To avoid this, I propose a new API for commands that use let-mutable.

Design

get-command-identity ctx: context -> string

It returns a hash value that identifies the command invocation for which ctx was given.

Using this API, the \footnote command (or FootnoteScheme) will be defined like:

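let-mutable footnote-ref <- 0 in
let-mutable footnote-dict <- [] in
let-inline ctx \footnote it =
    % ...
    let hash = get-command-identity ctx in
    let () =
        % pseudocode test: is hash already registered in !footnote-dict?
        if hash is in footnote-dict then
            ()  % same occurrence evaluated again: do not count it twice
        else
            let () = footnote-dict <- hash :: !footnote-dict in
            footnote-ref <- !footnote-ref + 1
    in
    % ...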
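
To make the intended behaviour concrete, here is a minimal Python sketch of the deduplication idea. It is a toy model rather than SATySFi code: the `identity` argument stands in for the value the proposed get-command-identity would return, and storing the assigned number per identity (instead of only a list of seen hashes) is an assumption made so that a second evaluation can return the same number.

```python
# Toy model (not SATySFi): a footnote counter that counts each command
# occurrence once, however many times that occurrence is evaluated.
footnote_numbers = {}  # identity -> assigned footnote number


def footnote(identity):
    """Return the footnote number for the occurrence named by `identity`.

    `identity` is a hypothetical stand-in for `get-command-identity ctx`.
    """
    if identity not in footnote_numbers:
        # First evaluation of this occurrence: allocate the next number.
        footnote_numbers[identity] = len(footnote_numbers) + 1
    # Later evaluations reuse the number allocated the first time.
    return footnote_numbers[identity]


# Evaluate the same two occurrences twice, as +xgenlisting-like commands do.
first_pass = [footnote("site-A"), footnote("site-B")]
second_pass = [footnote("site-A"), footnote("site-B")]
```

Both passes yield the same numbers, and only two footnotes are ever allocated.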

Example

For simplicity, suppose the hash is generated from the byte offset at which the command is used in the inline-text (determined when the AST is generated).

But this does not work in more complicated cases:

let-inline ctx \footnote-wrap it =
    read-inline ctx { 
        \footnote(#it;);  % (a) bytes from file head
    }
in
document '<
    +p {
        \footnote-wrap { hello } % (b) bytes from file head
        \footnote-wrap { world } % (c) bytes from file head
    }
>

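The correct behavior of this code is to display both hello and world as footnotes, so the two hashes must differ. However, both invocations reach \footnote through the same location (a), so a hash based on that location alone would be identical. To solve this, make ctx hold the history of byte locations and calculate the hash from that history.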
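
The history-based identity can be sketched in Python as follows. This is only an illustration under assumptions: the byte offsets for (a), (b) and (c) are made up, the helper names are hypothetical, and SHA-256 over a comma-joined string is just one possible way to turn the history into a hash.

```python
import hashlib


def enter_command(history, offset):
    """Return a new history with the byte offset of the command being entered."""
    return history + (offset,)


def command_identity(history):
    """Hash the whole call-site history (outermost command first)."""
    text = ",".join(str(offset) for offset in history)
    return hashlib.sha256(text.encode("ascii")).hexdigest()


# Hypothetical byte offsets of the sites marked (a), (b) and (c).
SITE_A, SITE_B, SITE_C = 60, 140, 190

# \footnote at (a), reached through \footnote-wrap at (b) and at (c).
via_b = enter_command(enter_command((), SITE_B), SITE_A)
via_c = enter_command(enter_command((), SITE_C), SITE_A)

id_b = command_identity(via_b)
id_c = command_identity(via_c)
```

The innermost offset is the same in both histories, which is exactly why a hash of the last location alone cannot distinguish them, while the full histories differ.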

yasuo-ozu commented 3 years ago

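In short: in the current SATySFi, when an inline-text contains a command such as \footer, evaluating that inline-text twice increments the footer counter twice.

To avoid this, I propose the following API:

get-command-identity ctx: context -> string

It works as follows. In code such as

let-inline ctx \footer-wrap it = %...
in
% ...
{
    \footer-wrap{ hello }
}

each time a command (here \footer-wrap) is used in an inline-text, its byte offset from the head of the file is appended to a list held in ctx. When get-command-identity ctx is called, it generates a hash from the contents of this list.

Commands that update let-mutable variables, such as \footnote, create a dictionary in which the hashes provided by get-command-identity are registered, and update the mutable variable only when the hash is not yet registered in that dictionary.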

elpinal commented 3 years ago

Is that a problem? Does it mean that every state-mutating command that may be used inside +xgenlisting or something similar is forced to use get-command-identity in order to avoid unintended behavior? I consider applying read-inline to the same inline-text twice to be the problem in itself. As for +xgenlisting, another extension to SATySFi might be needed to deal with state-mutating commands, but I argue that it is not a good solution to oblige providers of commands like \footnote to manage effects in such a sophisticated way.

gfngfn commented 3 years ago

Thank you for having a discussion (& sorry for the late response).

As to the language design for inline texts and inline box rows, my view is close to @elpinal -san's. That is, I think that applying read-inline twice to the same inline text is itself a somewhat problematic usage. Inline texts in general have the effect of mutating state, and thus they can basically be regarded as “affine” resources (though this is not reflected in any type-level restriction).

That said, I do see some need to consider cases where an inline text essentially has to be used more than once. For instance, consider a command that can render it : inline-text in more than one way and chooses among them depending on the total size of the resulting inline box rows:

let-inline ctx \decorate it =
  let ib1 = read-inline (some-settings-1 ctx) it in
  let ib2 = read-inline (some-settings-2 ctx) it in
  if first-one-is-better (get-natural-metrics ib1) (get-natural-metrics ib2) then
    ib1
  else
    ib2

IMHO, however, adding primitives like get-command-identity seems to complicate the semantics of the language too much. I feel that solving this kind of problem is a matter of language design rather than of just adding primitives. For example, if SATySFi had a kind of state-passing semantics (like that of Elm or React) and were free from mutable references, one could safely implement the command above as:

let-inline state ctx \decorate it =
  let (state1, ib1) = read-inline state (some-settings-1 ctx) it in
  let (state2, ib2) = read-inline state (some-settings-2 ctx) it in
  if first-one-is-better (get-natural-metrics ib1) (get-natural-metrics ib2) then
    (state1, ib1)
  else
    (state2, ib2)

(though this tends to make code somewhat redundant.)
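
The state-passing idea can be modelled in a few lines of Python. This is a toy sketch under assumptions, not a proposal for SATySFi's actual semantics: the state is just a footnote count, ctx is a single font-size number, and the token list and width rule are invented for illustration.

```python
def read_inline(state, ctx, inline_text):
    """Evaluate `inline_text` purely and return (new_state, boxes).

    `state` is the number of footnotes so far; the token "FOOTNOTE"
    marks a footnote. Nothing outside this function is modified.
    """
    boxes = []
    for token in inline_text:
        if token == "FOOTNOTE":
            state = state + 1  # rebinds the local name only
            boxes.append(("mark", state, ctx))
        else:
            boxes.append(("text", token, ctx))
    return state, boxes


def natural_width(boxes):
    """Invented metric: font size times number of characters."""
    return sum(size * len(str(payload)) for _kind, payload, size in boxes)


def decorate(state, ctx, inline_text, max_width):
    """Try two settings from the same input state and keep one result."""
    state1, ib1 = read_inline(state, ctx, inline_text)
    state2, ib2 = read_inline(state, ctx - 2, inline_text)
    if natural_width(ib1) <= max_width:
        return state1, ib1
    return state2, ib2


final_state, boxes = decorate(0, 10, ["hello", "FOOTNOTE"], max_width=50)
```

Because both trial evaluations start from the same input state and only one resulting state is returned, the footnote is counted once even though the inline text was read twice.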

yasuo-ozu commented 3 years ago

Thanks for the discussion. I also agree with the state-passing syntax, but it would be a breaking change to the current syntax. As a first step toward making mutable variables obsolete, I suggest the following syntax:

(Type.t is inspired by the article "SATySFiでad hoc多相", i.e. "Ad hoc polymorphism in SATySFi")

set-context-variable : string -> Type.t -> 'a -> context -> context
get-context-variable : string -> Type.t -> context -> 'a option
duplicate-context : context -> context
apply-context : context -> context -> ()

The goal of this syntax is to put all mutable variables inside the context.

Compared to current syntax

Compared to state-passing syntax

yasuo-ozu commented 3 years ago

If merging state into context is unsound, how about replacing the current SATySFi context with ('a, context)? I think keeping state and context separate is the more compatible approach, and we would not have to add primitives such as *-context-variable, duplicate-context and apply-context. However, this approach cannot eliminate let-mutable, because a command provider such as \footnote would still have to manage its own state corresponding to 'a.

yasuo-ozu commented 3 years ago

For example, regarding SATySFi's context as (int list, old-context), we can implement the mutable behavior with let-mutable as follows:

let-mutable identical-number <- 0 in
% duplicate-context : context -> context
let duplicate-context ctx =
    let (l, etc) = ctx in 
    let l = !identical-number :: l in
    let () = identical-number <- !identical-number + 1 in
    (l, etc)
in
let-mutable mutable-state <- Dict.make in
% get-context-variable : string -> context -> int option
let get-context-variable str ctx =
    let-rec inner l =
        match Dict.get(l, str) !mutable-state with
            | Some(r) -> Some(r)
            | None -> match l with
                | _ :: l -> inner l
                | _ -> None
    in
    let (l, _) = ctx in
    inner l
in
% set-context-variable : string -> int -> context -> ()
let set-context-variable str num ctx =
    let (l, _) = ctx in
    let () = mutable-state <- Dict.set (l, str) num !mutable-state in
    ()
in
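let-inline ctx \footnote it =
    % ...
    % get-context-variable returns an option; treat "unset" as 0
    let n =
        match get-context-variable `footnote-number` ctx with
        | Some(n) -> n
        | None    -> 0
    in
    let () = set-context-variable `footnote-number` (n + 1) ctx in
    % ...
in
let-inline ctx \eval-twice it =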
  let tmp-ctx = duplicate-context ctx in
  let tmp-ib = read-inline tmp-ctx it in
  let measuring = get-natural-metrics tmp-ib in
  read-inline (some-settings measuring ctx) it
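
The scoping behaviour this code aims for can be checked with a small Python model. It is a sketch under assumptions: tuples play the role of the int list, a plain dict replaces the Dict module used above, and the string "old-context" is a placeholder for the real context.

```python
import itertools

_fresh_ids = itertools.count()  # models `identical-number`
_store = {}                     # models `mutable-state`: (scope, name) -> value


def duplicate_context(ctx):
    """Return a context whose scope has one fresh id pushed in front."""
    scope, rest = ctx
    return ((next(_fresh_ids),) + scope, rest)


def get_context_variable(name, ctx):
    """Look up `name` in the current scope, then in each enclosing scope."""
    scope, _rest = ctx
    while True:
        if (scope, name) in _store:
            return _store[(scope, name)]
        if not scope:
            return None
        scope = scope[1:]  # fall back to the enclosing scope


def set_context_variable(name, value, ctx):
    """Write `name` under the current scope only."""
    scope, _rest = ctx
    _store[(scope, name)] = value


def footnote(ctx):
    n = get_context_variable("footnote-number", ctx) or 0
    set_context_variable("footnote-number", n + 1, ctx)
    return n + 1


root = ((), "old-context")
first = footnote(root)             # counted in the root scope
scratch = duplicate_context(root)  # temporary scope for measuring
trial = footnote(scratch)          # written under the scratch scope only
real = footnote(root)              # root is unaffected by the trial
```

The trial evaluation sees the root value but writes into its own scope, so the real evaluation afterwards gets the same number as the trial did.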

gfngfn commented 3 years ago

Thanks for additional suggestions. I have a few remarks, however:

gfngfn commented 3 years ago

(The following was posted in Japanese as a rough translation of the response above.)

Thank you for the additional suggestions. However, there are a few points I would like to raise:

yasuo-ozu commented 3 years ago

Thanks for the reply.

The second suggestion is not backward-compatible

It can be compatible if the language restricts 'a to int list. In fact, in the example above, 'a is bound to int list.

Thanks for explaining the philosophy; I understand that context should be immutable in the language design. However, the state-passing example is redundant. Is there any other solution to this so far?