plk / biber

Backend processor for BibLaTeX
Artistic License 2.0

[discussion] Static Perl compilation without PAR + make optional / drop more dependencies #338

Closed vadimkantorov closed 2 years ago

vadimkantorov commented 4 years ago

Hi! I'm building yet-another-attempt-of-latex-in-the-browser: https://vadimkantorov.github.io/busytext/busytex.html. So far I've managed to build xetex+bibtex8.

I was thinking of somehow trying to "compile" biber into WebAssembly. I was considering two options:

  1. perlcc to generate C code
  2. WebPerl to replace PAR packing

What are your thoughts on the feasibility of these two options? Have you tried using perlcc on biber? Are there any "nasty" required dependencies?

Does biber require popen / signals / web requests for normal functioning? If not, it can probably be made to work in WASM.

plk commented 4 years ago

I would be interested in helping to look into this. perlcc could work as it's an established system. For WebPerl, biber does use http requests to fetch external data sources and this might be an issue I suppose. Another potential issue is the btparse library that is used by Text::BibTeX as this is very old and I don't know how this would play with modern frameworks like this. I do some minor signal messing about in biber to contain some segfaults from libbtparse for example.

vadimkantorov commented 4 years ago

Cool :) I'll post my build attempts here

vadimkantorov commented 4 years ago

As part of the prep work, I'm working on compiling cperl (a Perl 5 fork that supports perlcc in a more stable way) statically with all modules, in order to run tlmgr from C code (tlmgr is a smaller Perl codebase than biber). If that works, it could be an alternative to WebPerl (WebPerl does the same for regular Perl), and it would also run natively.

If you're also interested in working on this, feel free to slide into https://github.com/perl11/cperl/issues/423#issuecomment-706597439

vadimkantorov commented 4 years ago

An update:

I wanted to see if one can get a fully static build (Emscripten has dynamic linking support, but it's new, and a static build is valuable on its own for embedding in other programs).

  1. I first checked whether cperl + perlcc can produce a fully static build. At the moment they cannot; progress is blocked on https://github.com/perl11/cperl/issues/423
  2. I failed to install RPerl, which might also work: the CPAN installs were taking too much time and producing too many errors on my system
  3. https://metacpan.org/pod/distribution/App-Staticperl/staticperl.pod may work, but needs to be checked (it configures the main perl distro for a static build)
  4. The WebPerl approach to getting a static build is probably similar to staticperl and may work outside of the WebAssembly context as well
vadimkantorov commented 4 years ago

@plk What is the minimum set of biber's dependencies for a minimal local-only working setup (without downloading anything from the internet and without running external programs)? Full list from Build.PL: autovivification Class::Accessor Data::Dump Data::Compare Data::Uniqid DateTime::Format::Builder DateTime::Calendar::Julian File::Slurper IPC::Cmd IPC::Run3 List::AllUtils List::MoreUtils List::MoreUtils::XS Mozilla::CA Regexp::Common Log::Log4perl Unicode::Collate Unicode::Normalize Unicode::LineBreak Unicode::GCString Encode::Locale Encode::EUCJPASCII Encode::JIS2K Encode::HanExtra Parse::RecDescent PerlIO::utf8_strict XML::LibXML XML::LibXML::Simple XML::LibXSLT XML::Writer Sort::Key Storable Text::CSV Text::CSV_XS Text::Roman IO::String URI Text::BibTeX LWP::UserAgent LWP::Protocol::https Business::ISBN Business::ISSN Business::ISMN Lingua::Translit

plk commented 4 years ago

That is the minimum dependencies you have listed there ...

vadimkantorov commented 4 years ago

E.g. Mozilla::CA or LWP should not be needed in theory if internet access is disabled, right?

Same for IPC. Does biber require running external programs for minimal functioning?

plk commented 4 years ago

Well, I'd have to see whether things would work without them as their integration isn't really that modular as it was never designed to be used without them.

vadimkantorov commented 3 years ago

Okay! In the meantime I'll try to build it with all dependencies. But it would certainly be more robust for these cross-compilation scenarios (and give a smaller binary!) if there were a "no-calling-external-programs + no-downloading-from-internet" mode that lets biber run even if LWP/IPC are not installed.

vadimkantorov commented 3 years ago

Do you know how to skip tests during ./Build installdeps? Doing CPAN_OPTS=-T ./Build installdeps does not help. Otherwise installing all dependencies is very slow.

It may be that CPAN_OPTS handling is broken in cpan. So a switch in ./Build or ./Build.pl to enable cpan -T instead of cpan would be nice.

vadimkantorov commented 3 years ago

> That is the minimum dependencies you have listed there ...

Why does the https://github.com/plk/biber/blob/dev/dist/linux_x86_64/build.sh PAR run command contain much fewer dependencies? namely, only

  --module=Pod::Simple::TranscodeSmart \
  --module=Pod::Simple::TranscodeDumb \
  --module=List::MoreUtils::XS \
  --module=List::SomeUtils::XS \
  --module=List::MoreUtils::PP \
  --module=HTTP::Status \
  --module=HTTP::Date \
  --module=Encode:: \
  --module=File::Find::Rule \
  --module=IO::Socket::SSL \
  --module=IO::String \
  --module=PerlIO::utf8_strict \
  --module=Text::CSV_XS \
  --module=DateTime \
vadimkantorov commented 3 years ago

Also, I'm trying to set up a Github Actions build script in https://github.com/vadimkantorov/buildbiber/blob/master/.github/workflows/build.yml

Maybe it would be good for biber to have it for some testing as well in the main repo

plk commented 3 years ago

> Do you know how to skip tests during ./Build installdeps? Doing CPAN_OPTS=-T ./Build installdeps does not help. Otherwise installing all dependencies is very slow.
>
> It may be that CPAN_OPTS handling is broken in cpan. So a switch in ./Build or ./Build.pl to enable cpan -T instead of cpan would be nice.

Did you try ./Build installdeps --cpan_client 'cpan -T'

vadimkantorov commented 3 years ago

> Did you try ./Build installdeps --cpan_client 'cpan -T'

Not yet. Will try!

plk commented 3 years ago

> That is the minimum dependencies you have listed there ...
>
> Why does the https://github.com/plk/biber/blob/dev/dist/linux_x86_64/build.sh PAR run command contain much fewer dependencies? namely, only
>
>   --module=Pod::Simple::TranscodeSmart \
>   --module=Pod::Simple::TranscodeDumb \
>   --module=List::MoreUtils::XS \
>   --module=List::SomeUtils::XS \
>   --module=List::MoreUtils::PP \
>   --module=HTTP::Status \
>   --module=HTTP::Date \
>   --module=Encode:: \
>   --module=File::Find::Rule \
>   --module=IO::Socket::SSL \
>   --module=IO::String \
>   --module=PerlIO::utf8_strict \
>   --module=Text::CSV_XS \
>   --module=DateTime \

It does depend on the platform. It's somewhat empirical: you have to tweak the module includes in the build script until the executable works. This is because the dependency-checking modules are quite sensitive to platforms and the idiosyncrasies of particular perl builds. If in doubt, use a module line to import explicitly; it can't hurt. I usually start with a basic set from a similar OS and then add modules as needed when I test the binary and see that the automatic detection didn't pack a module that's needed.
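
One way to keep this trial-and-error manageable is to hold the explicit module list in a plain text file per platform and generate the flags from it. The sketch below assumes such a file exists; the function name `pp_module_flags` and the file name `modules-linux.txt` are made up for illustration and are not part of biber's build scripts.

```shell
# Turn a plain list of module names (one per line, read from stdin)
# into pp --module flags. Blank lines and lines starting with '#' are skipped.
pp_module_flags() {
    sed -e 's/[[:space:]]*$//' -e '/^$/d' -e '/^#/d' -e 's/^/--module=/'
}

# Hypothetical usage (modules-linux.txt is an assumed file name):
#   pp $(pp_module_flags < modules-linux.txt) --output=biber bin/biber
```

With this layout, adding a module that the automatic detection missed becomes a one-line change to the list rather than an edit to the build script.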

plk commented 3 years ago

Regarding testing, I haven't looked into this on GitHub, as it requires setting up a perl environment etc., and that's probably not trivial ...

vadimkantorov commented 3 years ago

They have an Ubuntu installation, but without a perl distribution pre-installed.

I opted for compiling one from scratch (since going forward I'd like to replace it by a custom statically-built WebAssembly one):

name: build

on: workflow_dispatch

env:
  URL: https://www.cpan.org/src/5.0/perl-5.32.0.tar.gz
  MAKEFLAGS: -j2

jobs:
  build:
    runs-on: ubuntu-20.04
    steps:
       - name: Install Perl
         run: |
           echo Downloading and compiling in [$PWD] from [$URL]

           mkdir perl
           wget -nc $URL
           tar -xf $(basename $URL) --strip-components=1 --directory=perl

           pushd perl
           bash +x ./Configure -sde -Dprefix="$PWD/../prefix"
           test -f Makefile
           make
           make install
           popd

       - name: Install Biber Dependencies Without Test
         run: |
           BEFORE=$(find ./prefix | wc -l)
           ./prefix/bin/cpan -T Module::Build Config::AutoConf ExtUtils::LibBuilder    autovivification Class::Accessor Data::Dump Data::Compare Data::Uniqid DateTime::Format::Builder DateTime::Calendar::Julian File::Slurper IPC::Cmd IPC::Run3 List::AllUtils List::MoreUtils List::MoreUtils::XS Mozilla::CA Regexp::Common Log::Log4perl Unicode::Collate Unicode::Normalize Unicode::LineBreak Unicode::GCString Encode::Locale Encode::EUCJPASCII Encode::JIS2K Encode::HanExtra Parse::RecDescent PerlIO::utf8_strict XML::LibXML XML::LibXML::Simple XML::LibXSLT XML::Writer Sort::Key Storable Text::CSV Text::CSV_XS Text::Roman IO::String URI Text::BibTeX LWP::UserAgent LWP::Protocol::https Business::ISBN Business::ISSN Business::ISMN Lingua::Translit
           AFTER=$(find ./prefix | wc -l)
           echo files before: $BEFORE after: $AFTER
vadimkantorov commented 3 years ago

Running https://github.com/vadimkantorov/buildbiber/blob/master/.github/workflows/build.yml, essentially:

# perl and cpan are built from sources

cpan -T Test::More Test::Differences File::Which    Module::Build    Config::AutoConf ExtUtils::LibBuilder    autovivification Class::Accessor Data::Dump Data::Compare Data::Uniqid DateTime::Format::Builder DateTime::Calendar::Julian File::Slurper IPC::Cmd IPC::Run3 List::AllUtils List::MoreUtils List::MoreUtils::XS Mozilla::CA Regexp::Common Log::Log4perl Unicode::Collate Unicode::Normalize Unicode::LineBreak Unicode::GCString Encode::Locale Encode::EUCJPASCII Encode::JIS2K Encode::HanExtra Parse::RecDescent PerlIO::utf8_strict XML::LibXML XML::LibXML::Simple XML::LibXSLT XML::Writer Sort::Key Storable Text::CSV Text::CSV_XS Text::Roman IO::String URI Text::BibTeX LWP::UserAgent LWP::Protocol::https Business::ISBN Business::ISSN Business::ISMN Lingua::Translit

URLBIBER=https://github.com/plk/biber/archive/v2.15.tar.gz
URLTESTFILES=https://master.dl.sourceforge.net/project/biblatex-biber/biblatex-biber/testfiles
BIBERTESTFILES="test.bib test.bcf test-dev.bcf unifont.ttf"

mkdir biber
wget $URLBIBER
tar -xf $(basename $URLBIBER) --strip-components=1 --directory biber

pushd biber
perl ./Build.PL
perl ./Build install

wget $(printf "$URLTESTFILES/%s " $BIBERTESTFILES)
perl ./Build test
perl ./bin/biber --validate-control --convert-control test

This produced a lot of sortinithash mismatches. Is there a way to install a specific version of Unicode::Collate to get rid of them?

I attach the full test log: log.txt

plk commented 3 years ago

What version of perl and U::C is on the test box?

vadimkantorov commented 3 years ago

I install perl 5.32.0 there from the sources at https://github.com/Perl/perl5/tree/v5.32.0

I could easily upgrade the perl version there if needed by just specifying a different version

vadimkantorov commented 3 years ago

I don't know what U::C version it installs from CPAN. If a particular U::C version is required, I can check whether it's possible to install a specific version from CPAN.

plk commented 3 years ago

Do you know what version of Unicode::Collate is there?

plk commented 3 years ago

I see - I think that my install is not using the latest U::C - let me update it and the test results, and then let's see. For now it would be best to use perl 5.30 for the tests until I migrate all the builds/tests to 5.32. Ah, I see the sortinithash issues with U::C 1.29; I will fix them in the tests in the DEV branch and update here.

vadimkantorov commented 3 years ago

In all likelihood it's:

./prefix/bin/cpan -D Unicode::Collate
Reading '/home/vadimkantorov/.cpan/Metadata'
  Database was generated on Sat, 28 Nov 2020 14:17:03 GMT
Unicode::Collate
-------------------------------------------------------------------------
        (no description)
        S/SA/SADAHIRO/Unicode-Collate-1.29.tar.gz
        /home/vadimkantorov/perl/../prefix/lib/5.32.0/x86_64-linux/Unicode/Collate.pm
        Installed: 1.27
        CPAN:      1.29  Not up to date
        SADAHIRO Tomoyuki (SADAHIRO)
        SADAHIRO@cpan.org
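
A minimal sketch of checking the installed version and pinning a newer one, assuming the `./prefix` layout from the workflow above. The helper name `version_at_least` is hypothetical; the distribution path is the one printed by `cpan -D`.

```shell
# Succeeds (exit 0) when the installed version is at least the required one.
# Perl module versions such as 1.27 and 1.29 are compared as decimal numbers.
version_at_least() {
    awk -v have="$1" -v need="$2" 'BEGIN { exit !(have + 0 >= need + 0) }'
}

# Hypothetical usage against the locally built perl:
#   have=$(./prefix/bin/perl -MUnicode::Collate -e 'print Unicode::Collate->VERSION')
#   version_at_least "$have" 1.29 || ./prefix/bin/cpan -T SADAHIRO/Unicode-Collate-1.29.tar.gz
```

Passing an `AUTHOR/Distribution.tar.gz` path to `cpan` asks for that exact distribution rather than whatever the index currently points to.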
vadimkantorov commented 3 years ago

If it matters, the latest release at https://github.com/Perl/perl5/releases is 5.33.4

plk commented 3 years ago

I never use odd-numbered versions, as they are development releases. 5.32 would be the latest stable version.

vadimkantorov commented 3 years ago

My github workflow file doesn't do PAR packing, since I didn't need it, but if it's useful for a single beginner/simple/free_CI build script, please feel free to build upon it

plk commented 3 years ago

DEV branch is updated with requirement for U::C 1.29 and all tests should have the correct sortinithashes for this now.

vadimkantorov commented 3 years ago

Can PAR output the final list of resolved module dependencies? I'm making progress with compiling WebPerl, so a full list of modules to try bundling with Perl at build time would be very useful.

plk commented 3 years ago

Have a look at the docs for Module::ScanDeps - that's what PAR uses to resolve dependencies. The --module lines in the PAR script are simply there to catch things that this module can't detect (usually due to runtime includes etc.)

vadimkantorov commented 3 years ago

Instead of a binary, I produced a zipball with PAR and dumped its file listing: log.txt

Here is a listing of archive's lib directory:

vadimkantorov@DESKTOP-4UF8FID:~/buildbiber/foo$ ls lib
AutoLoader.pm  Config_heavy.pl  Errno.pm       IO.pm    Number          SelfLoader.pm  Try                  constant.pm       strict.pm
AutoSplit.pm   Cwd.pm           Eval           IPC      Opcode.pm       Socket.pm      URI                  deprecate.pm      subs.pm
B              DBD              Exception      JSON     POSIX.pm        Sort           URI.pm               feature.pm        threads
B.pm           DBI              Exporter       LWP      Package         Specio         Unicode              if.pm             threads.pm
Biber          DBI.pm           Exporter.pm    LWP.pm   PadWalker.pm    Specio.pm      Variable             integer.pm        unicore
Biber.pm       Data             ExtUtils       Lingua   Params          Storable.pm    XML                  locale.pm         utf8.pm
Business       Date             Fcntl.pm       List     Parse           Sub            XSLoader.pm          meta_notation.pm  utf8_heavy.pl
CPAN           DateTime         File           Locale   PerlIO          Symbol.pm      _charnames.pm        mro.pm            vars.pm
Carp           DateTime.pm      FileHandle.pm  Log      PerlIO.pm       Sys            attributes.pm        namespace         version
Carp.pm        Devel            Getopt         MIME     Pod             Term           auto                 overload          version.pm
Class          Digest           HTML           MRO      Regexp          Test           autovivification.pm  overload.pm       warnings
Clone.pm       Dist             HTTP           Math     Role            Test2          base.pm              overloading.pm    warnings.pm
Compress       DynaLoader.pm    Hash           Module   Safe.pm         Text           bytes.pm             parent.pm
Config.pm      Encode           I18N           Mozilla  Scalar          Tie            bytes_heavy.pl       re.pm
Config_git.pl  Encode.pm        IO             Net      SelectSaver.pm  Time           charnames.pm         sigtrap.pm
vadimkantorov commented 3 years ago

I will try to have a static perl build with just these modules (for native and for wasm).

If you can make LWP/internet access/process launching (and the corresponding module imports) optional, it would greatly increase the chance of success.

vadimkantorov commented 3 years ago

Here are the shared libraries that PAR packs:

   102568  2020-11-29 23:44   lib/auto/B/B.so
    55928  2020-11-30 09:45   lib/auto/Class/XSAccessor/XSAccessor.so
    17824  2020-11-30 09:39   lib/auto/Clone/Clone.so
    86912  2020-11-29 23:44   lib/auto/Compress/Raw/Bzip2/Bzip2.so
   138920  2020-11-29 23:44   lib/auto/Compress/Raw/Zlib/Zlib.so
    22888  2020-11-29 23:44   lib/auto/Cwd/Cwd.so
  1394840  2020-11-30 09:53   lib/auto/DBD/SQLite/SQLite.so
   142408  2020-11-30 09:52   lib/auto/DBI/DBI.so
    45000  2020-11-29 23:44   lib/auto/Data/Dumper/Dumper.so
    21344  2020-11-30 09:48   lib/auto/DateTime/DateTime.so
     8352  2020-11-30 09:42   lib/auto/Devel/Caller/Caller.so
     8504  2020-11-30 09:42   lib/auto/Devel/LexAlias/LexAlias.so
    22472  2020-11-29 23:44   lib/auto/Digest/MD5/MD5.so
   410240  2020-11-30 09:53   lib/auto/Encode/Byte/Byte.so
  2218008  2020-11-30 09:53   lib/auto/Encode/CN/CN.so
    47696  2020-11-30 09:53   lib/auto/Encode/EBCDIC/EBCDIC.so
   789464  2020-11-30 09:54   lib/auto/Encode/EUCJPASCII/EUCJPASCII.so
    55512  2020-11-30 09:53   lib/auto/Encode/Encode.so
 12054800  2020-11-30 09:54   lib/auto/Encode/HanExtra/HanExtra.so
  2512056  2020-11-30 09:54   lib/auto/Encode/JIS2K/JIS2K.so
  2868608  2020-11-30 09:53   lib/auto/Encode/JP/JP.so
  2556016  2020-11-30 09:53   lib/auto/Encode/KR/KR.so
    63824  2020-11-30 09:53   lib/auto/Encode/Symbol/Symbol.so
  2149592  2020-11-30 09:53   lib/auto/Encode/TW/TW.so
    22208  2020-11-30 09:53   lib/auto/Encode/Unicode/Unicode.so
    21944  2020-11-29 23:44   lib/auto/Fcntl/Fcntl.so
     8336  2020-11-29 23:44   lib/auto/File/DosGlob/DosGlob.so
    32880  2020-11-29 23:44   lib/auto/File/Glob/Glob.so
    54416  2020-11-30 09:50   lib/auto/HTML/Parser/Parser.so
    23240  2020-11-29 23:44   lib/auto/Hash/Util/FieldHash/FieldHash.so
    17704  2020-11-29 23:44   lib/auto/I18N/Langinfo/Langinfo.so
    23576  2020-11-29 23:44   lib/auto/IO/IO.so
    26696  2020-11-29 23:44   lib/auto/IPC/SysV/SysV.so
   143416  2020-11-30 09:41   lib/auto/List/MoreUtils/XS/XS.so
   104752  2020-11-30 09:49   lib/auto/List/SomeUtils/XS/XS.so
    63608  2020-11-29 23:44   lib/auto/List/Util/Util.so
    17760  2020-11-29 23:44   lib/auto/MIME/Base64/Base64.so
    18096  2020-11-29 23:44   lib/auto/Math/BigInt/FastCalc/FastCalc.so
    27776  2020-11-29 23:44   lib/auto/Opcode/Opcode.so
   120872  2020-11-29 23:44   lib/auto/POSIX/POSIX.so
    32584  2020-11-30 09:46   lib/auto/Package/Stash/XS/XS.so
    22904  2020-11-30 09:42   lib/auto/PadWalker/PadWalker.so
    48992  2020-11-30 09:49   lib/auto/Params/Validate/XS/XS.so
    33000  2020-11-29 23:44   lib/auto/PerlIO/encoding/encoding.so
    18856  2020-11-29 23:44   lib/auto/PerlIO/scalar/scalar.so
    18424  2020-11-30 09:39   lib/auto/PerlIO/utf8_strict/utf8_strict.so
    48704  2020-11-29 23:44   lib/auto/Socket/Socket.so
    32200  2020-11-30 09:56   lib/auto/Sort/Key/Key.so
   114912  2020-11-29 23:44   lib/auto/Storable/Storable.so
    12816  2020-11-30 09:45   lib/auto/Sub/Identify/Identify.so
     8568  2020-11-29 23:44   lib/auto/Sys/Hostname/Hostname.so
    22384  2020-11-29 23:44   lib/auto/Sys/Syslog/Syslog.so
    36968  2020-11-30 09:57   lib/auto/Text/BibTeX/BibTeX.so
    70072  2020-11-30 09:53   lib/auto/Text/CSV_XS/CSV_XS.so
    36000  2020-11-29 23:44   lib/auto/Time/HiRes/HiRes.so
  1585904  2020-11-30 09:54   lib/auto/Unicode/Collate/Collate.so
   165816  2020-11-30 09:44   lib/auto/Unicode/LineBreak/LineBreak.so
   614176  2020-11-29 23:44   lib/auto/Unicode/Normalize/Normalize.so
    37856  2020-11-30 09:45   lib/auto/Variable/Magic/Magic.so
   412944  2020-11-30 09:55   lib/auto/XML/LibXML/LibXML.so
    62784  2020-11-30 09:56   lib/auto/XML/LibXSLT/LibXSLT.so
    83104  2020-11-30 09:50   lib/auto/XML/Parser/Expat/Expat.so
    13592  2020-11-29 23:44   lib/auto/attributes/attributes.so
    28928  2020-11-30 09:39   lib/auto/autovivification/autovivification.so
    23520  2020-11-29 23:44   lib/auto/mro/mro.so
   603744  2020-11-29 23:44   lib/auto/re/re.so
     7752  2020-11-29 23:44   lib/auto/threads/shared/shared.so
     7744  2020-11-29 23:44   lib/auto/threads/threads.so
    82360  2017-08-31 19:21   shlib/x86_64-linux/libbtparse.so.1
  2917216  2019-11-12 17:58   shlib/x86_64-linux/libcrypto.so.1.1
    87912  2019-10-22 14:52   shlib/x86_64-linux/libexslt.so.0
   577312  2019-11-12 17:58   shlib/x86_64-linux/libssl.so.1.1
  1834232  2020-02-05 18:08   shlib/x86_64-linux/libxml2.so.2
   247952  2019-10-22 14:52   shlib/x86_64-linux/libxslt.so.1
   116960  2017-05-23 13:32   shlib/x86_64-linux/libz.so.1
    33000  2020-11-29 23:44   lib/auto/PerlIO/encoding/encoding.so
    14232  2020-11-29 23:44   lib/auto/PerlIO/mmap/mmap.so
    18856  2020-11-29 23:44   lib/auto/PerlIO/scalar/scalar.so
    28552  2020-11-29 23:44   lib/auto/PerlIO/via/via.so

I'll see if perl can build these modules statically at compile time
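
Since XS shared objects are laid out as `lib/auto/<Package/Path>/<Name>.so`, the package names can be derived mechanically from a listing like the one above. A rough sketch, with `so_to_module` being a hypothetical helper name and `listing.txt` an assumed file holding the listing:

```shell
# Map PAR archive paths such as lib/auto/Class/XSAccessor/XSAccessor.so
# to Perl package names (Class::XSAccessor). Reads a listing on stdin;
# lines without lib/auto/ (e.g. the shlib/ entries) are ignored.
so_to_module() {
    sed -n -e 's|^.*lib/auto/\(.*\)/[^/]*\.so$|\1|p' \
        | sed -e 's|/|::|g' \
        | LC_ALL=C sort -u
}

# Hypothetical usage:
#   so_to_module < listing.txt
```

The `sort -u` step also removes the repeated PerlIO entries at the end of the listing.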

vadimkantorov commented 3 years ago

I.e. the packages with shared libraries in the PAR file are listed below. So it probably makes sense to start with them for a static_ext Perl build.

@plk Are there any that strike you as unnecessary? Since I'm going to manually find all dependencies, we could get rid of some unneeded ones.

[ok] B
[ok] Class::XSAccessor
[ok] Clone
[ok] Compress::Raw::Bzip2
[ok] Compress::Raw::Zlib
[ok] Cwd
DBD::SQLite
[ok] DBI
[ok] Data::Dumper
[ok] DateTime
[ok] Devel::Caller
[ok] Devel::LexAlias
[ok] Digest::MD5
[ok] Encode::Byte
[ok] Encode::CN
[ok] Encode::EBCDIC
[ok] Encode::EUCJPASCII
[ok] Encode
[ok] Encode::HanExtra
[ok] Encode::JIS2K
[ok] Encode::JP
[ok] Encode::KR
[ok] Encode::Symbol
[ok] Encode::TW
[ok] Encode::Unicode
[ok] Fcntl
[ok] File::DosGlob
[ok] File::Glob
[ok] HTML::Parser
[ok] Hash::Util::FieldHash
[ok] I18N::Langinfo
[ok] IO
[ok] IPC::SysV
List::MoreUtils::XS
List::SomeUtils::XS
[ok] List::Util
[ok] MIME::Base64
[ok] Math::BigInt::FastCalc
[ok] Opcode
[ok] POSIX
[ok] Package::Stash::XS
[ok] PadWalker
Params::Validate::XS
[ok] PerlIO::utf8_strict
[ok] Socket
[ok] Sort::Key
[ok] Storable
[ok] Sub::Identify
[ok] Sys::Hostname
[ok] Sys::Syslog
Text::BibTeX
[ok] Text::CSV_XS
[ok] Time::HiRes
[ok] Unicode::Collate
[ok] Unicode::LineBreak
[ok] Unicode::Normalize
[ok] Variable::Magic
XML::LibXML
[ok] XML::LibXSLT
[ok] XML::Parser
[ok] attributes
[ok] autovivification
[ok] mro
[ok] re
[ok] threads::shared
[ok] threads
[ok] PerlIO::encoding
[ok] PerlIO::mmap
[ok] PerlIO::scalar
[ok] PerlIO::via
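
For the core extensions in this list, Perl's Configure names extensions with `/` instead of `::` (for example `Data/Dumper`). A small sketch of producing a `static_ext` value from package names; `static_ext_value` and `modules.txt` are hypothetical names. It is an assumption here that Configure only picks up extensions present in the perl source tree, so CPAN modules such as Text::BibTeX would need separate handling.

```shell
# Build a space-separated static_ext value from package names on stdin.
# Package separators (::) become path separators (/), one extension per word.
static_ext_value() {
    sed -e 's|::|/|g' | paste -s -d ' ' -
}

# Hypothetical usage inside an unpacked perl source tree:
#   ./Configure -sde -Dstatic_ext="$(static_ext_value < modules.txt)"
```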
aterenin commented 3 years ago

Hi both

I wanted to make you aware that changes made in support of WebAssembly could also be of great help for Biber+Tectonic integration, which right now is difficult due to interop complexity. Most people are using this upstream by invoking Biber themselves in their IDE of choice in addition to Tectonic. Integration is being discussed in the following issue.

https://github.com/tectonic-typesetting/tectonic/issues/35

Being able to build without internet functionality could well be appreciated for this use case, in line with the compilation reproducibility goals of Tectonic.

Thanks again for your work here!

vadimkantorov commented 3 years ago

@aterenin I made some attempts at compiling Perl statically with all the dependencies. It almost worked, but I don't have time to continue this fight at the moment. If you're interested in hearing about my progress, ping me at @vadimkantorov on Telegram or at vadimkantorov@gmail.com :)

On a side note, I'm also lazily developing https://busytex.github.io - a completely client-side wasm TeX Live compiler with basic GitHub integration (~3k lines of Makefiles and JavaScript). Let me know if you wish to collaborate.

plk commented 2 years ago

Will revisit in future if there is sufficient interest.

vadimkantorov commented 8 months ago

I've tried again. Here is the list of PAR-discovered dependencies:

./shlib/x86_64-linux/libcrypto.so.3
./shlib/x86_64-linux/libexslt.so.0
./shlib/x86_64-linux/libbtparse.so
./shlib/x86_64-linux/libssl.so.3
./shlib/x86_64-linux/libxslt.so.1
./shlib/x86_64-linux/libz.so.1
./shlib/x86_64-linux/libxml2.so.2
+ ./lib/auto/I18N/Langinfo/Langinfo.so
+ ./lib/auto/Digest/MD5/MD5.so
./lib/auto/Sort/Key/Key.so
./lib/auto/Compress/Raw/Zlib/Zlib.so
./lib/auto/Compress/Raw/Bzip2/Bzip2.so
+ ./lib/auto/Encode/Encode.so
./lib/auto/Encode/Unicode/Unicode.so
./lib/auto/Encode/CN/CN.so
./lib/auto/Encode/KR/KR.so
./lib/auto/Encode/EUCJPASCII/EUCJPASCII.so
./lib/auto/Encode/Symbol/Symbol.so
./lib/auto/Encode/EBCDIC/EBCDIC.so
./lib/auto/Encode/Byte/Byte.so
./lib/auto/Encode/TW/TW.so
./lib/auto/Encode/HanExtra/HanExtra.so
./lib/auto/Encode/JP/JP.so
./lib/auto/Encode/JIS2K/JIS2K.so
./lib/auto/autovivification/autovivification.so
./lib/auto/Devel/Caller/Caller.so
./lib/auto/Devel/LexAlias/LexAlias.so
./lib/auto/XML/LibXSLT/LibXSLT.so
./lib/auto/XML/Parser/Expat/Expat.so
./lib/auto/XML/LibXML/LibXML.so
./lib/auto/Unicode/Normalize/Normalize.so
./lib/auto/Unicode/LineBreak/LineBreak.so
./lib/auto/Unicode/Collate/Collate.so
./lib/auto/IPC/SysV/SysV.so
./lib/auto/Clone/Clone.so
+ ./lib/auto/B/B.so
./lib/auto/Text/CSV_XS/CSV_XS.so
./lib/auto/Text/BibTeX/BibTeX.so
./lib/auto/PadWalker/PadWalker.so
+ ./lib/auto/PerlIO/encoding/encoding.so
+ ./lib/auto/PerlIO/scalar/scalar.so
+ ./lib/auto/PerlIO/via/via.so
./lib/auto/PerlIO/utf8_strict/utf8_strict.so
./lib/auto/PerlIO/mmap/mmap.so
./lib/auto/MIME/Base64/Base64.so
./lib/auto/HTML/Parser/Parser.so
./lib/auto/File/DosGlob/DosGlob.so
+ ./lib/auto/File/Glob/Glob.so
./lib/auto/threads/shared/shared.so
./lib/auto/threads/threads.so
./lib/auto/Time/HiRes/HiRes.so
./lib/auto/Sys/Syslog/Syslog.so
./lib/auto/Sys/Hostname/Hostname.so
+ ./lib/auto/Opcode/Opcode.so
+ ./lib/auto/Data/Dumper/Dumper.so
+ ./lib/auto/attributes/attributes.so
./lib/auto/IO/Compress/Brotli/Brotli.so
+ ./lib/auto/IO/IO.so
./lib/auto/List/MoreUtils/XS/XS.so
./lib/auto/List/SomeUtils/XS/XS.so
+ ./lib/auto/List/Util/Util.so
./lib/auto/Filter/Util/Call/Call.so
./lib/auto/DBI/DBI.so
./lib/auto/Net/SSLeay/SSLeay.so
./lib/auto/re/re.so
+ ./lib/auto/Fcntl/Fcntl.so
./lib/auto/Socket/Socket.so
./lib/auto/Sub/Identify/Identify.so
./lib/auto/DateTime/DateTime.so
+ ./lib/auto/mro/mro.so
./lib/auto/Storable/Storable.so
./lib/auto/Variable/Magic/Magic.so
./lib/auto/Math/BigInt/FastCalc/FastCalc.so
./lib/auto/Class/XSAccessor/XSAccessor.so
./lib/auto/Hash/Util/FieldHash/FieldHash.so
./lib/auto/DBD/SQLite/SQLite.so
./lib/auto/Package/Stash/XS/XS.so
+ ./lib/auto/Cwd/Cwd.so
./lib/auto/Params/Validate/XS/XS.so
./lib/auto/Params/Util/Util.so
+ ./lib/auto/POSIX/POSIX.so
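
To see what changed relative to the 2020 listing earlier in the thread, the two listings can be normalized and compared. A sketch, assuming both listings were saved to text files; the function names and the file names `old.txt` and `new.txt` are hypothetical.

```shell
# Extract the lib/auto/... .so paths from a listing file, one per line, sorted.
# Size/date columns and leading "+ ./" markers are dropped by the match itself.
extract_so_paths() {
    grep -oE 'lib/auto/[^[:space:]]+\.so' "$1" | LC_ALL=C sort -u
}

# Print the paths that appear in the new listing ($2) but not in the old one ($1).
new_so_paths() {
    old_tmp=$(mktemp) && new_tmp=$(mktemp) || return 1
    extract_so_paths "$1" > "$old_tmp"
    extract_so_paths "$2" > "$new_tmp"
    LC_ALL=C comm -13 "$old_tmp" "$new_tmp"
    rm -f "$old_tmp" "$new_tmp"
}

# Hypothetical usage:
#   new_so_paths old.txt new.txt
```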
vadimkantorov commented 7 months ago

@plk Btw, I managed to statically build all these *.so package dependencies with a simple enough script.

So there might be a path towards building all of biber statically with musl, without any packer utilities.

Maybe a remaining question for portability is: does biber use backticks/system shell calls? If so, it would be better to replace them with native Perl function calls, or at least to localize all such shell calls in one Perl source file, to ease inspecting them and replacing them with more portable Perl code in the future.

This could also open a path to providing biber as a library (if needed).
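
For the backticks question above, a quick heuristic on any Perl codebase is to scan the sources for backticks, `qx`, `system` and `exec`. The sketch below only generates leads (it also matches inside comments, strings and POD), and `find_shell_calls` is a hypothetical name.

```shell
# Heuristic scan for backticks, qx, system and exec in *.pm / *.pl files
# under a directory. Method calls and variables such as ->execute or $system
# are excluded by requiring a non-identifier character on both sides.
find_shell_calls() {
    pattern='`[^`]*`|(^|[^[:alnum:]_$@%:>])(qx|system|exec)[^[:alnum:]_]'
    # /dev/null forces grep to print file names even for a single file.
    find "$1" -type f \( -name '*.pm' -o -name '*.pl' \) \
        -exec grep -nE "$pattern" /dev/null {} +
}

# Hypothetical usage from a source checkout:
#   find_shell_calls lib
```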

plk commented 7 months ago

No, no backticks. Any such thing is done with OS neutral perl modules by design.

vadimkantorov commented 7 months ago

This is great news, as I found that tlmgr.pl / install-tl.pl are using backticks all over the place...

plk commented 7 months ago

Since biber has to run on Windows, Mac and Linux, a lot of effort was put in to make sure it played nicely with all of them in a non-hacky way ....

vadimkantorov commented 1 month ago

@plk I succeeded in compiling a fully static variant of Perl (without any .so) - https://github.com/vadimkantorov/perlpack

At some point I'll try to use it instead of PAR to compile/pack biber, without any temporary file extraction and so on: everything is embedded in a read-only virtual FS (including *.pm files and any other files needed).