GrammaticalFramework / gf-core

Grammatical Framework core: compiler, shell & runtimes
https://www.grammaticalframework.org
Other
131 stars 35 forks source link

"Majestic" PGF runtime #130

Open johnjcamilleri opened 3 years ago

johnjcamilleri commented 3 years ago

@krangelov has mentioned at the summer school his ideas on completely reimagining the PGF format as a database. @anka-213 has expressed his interest in contributing, and I (@johnjcamilleri) also think I could be of use here. I'm creating this issue to be the home of that discussion.

krangelov commented 3 years ago
  • pgf_function_prob returns positive infinity for a non-existent function; wouldn't 0 or some error make more sense? Maybe the runtime doesn't need to change, but I feel like the bindings should at least handle this better (e.g. raising KeyError in Python)

The probability is in the logarithmic domain. I return INFINITY because this is what you get from - log 0.

  • pgf_category_prob seems to always return 0, regardless of whether the category exists or not. I added two (failing) test cases in Haskell for this.

In order to get something that is not zero you need to compile the grammar with an actual .probs file. There is nothing wrong with pgf_category_prob.

  • In the Haskell probability tests you are just checking for equality with pi, but because of floating-point imprecision I am having to use math.isclose() in Python. In createFunction I am reading the probability with PyArg_ParseTuple(args, "f", &prob) into a prob_t and in functionProbability I am converting it back with PyFloat_FromDouble((double)prob), which seems logical to me. I don't understand why it works in Haskell though.

You get it because the variable prob in the test_transactions.py is of type double while the runtime is using float internally. After the conversion from double to float you get a rounding. In Haskell I used the type double and then everything is fine. As far as I know Python doesn't have separate types for Float and Double like C and Haskell do. Maybe the solution with math.isclose is not that bad.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-927704931, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZCGXIPVKFCB7VW5QKLUEA4ZDANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

In order to get something that is not zero you need to compile the grammar with an actual .probs file.

But how come the behaviour is different for non-existant functions and categories?

In Haskell I used the type double and then everything is fine.

I guess you mean Float here.

krangelov commented 3 years ago

On Mon, 27 Sept 2021 at 14:08, John J. Camilleri @.***> wrote:

In order to get something that is not zero you need to compile the grammar with an actual .probs file.

But how come the behaviour is different for non-existant functions and categories?

I see it now. Function pgf_category_prob should return INFINITY for non-existant categories as well. I fixed that.

In Haskell I used the type double and then everything is fine.

I guess you mean Float here.

Yes. I meant Float.

johnjcamilleri commented 3 years ago

So far I've been doing everything inside Docker. Today I tried to install the runtime directly in macOS and had some problems. I'm following the old instructions, namely:

$ glibtoolize
$ autoreconf -i
$ ./configure
$ make
$ make install

And the make step gives me:

/Applications/Xcode.app/Contents/Developer/usr/bin/make  all-am
  CXX      pgf/libpgf_la-db.lo
In file included from pgf/db.cxx:6:
In file included from pgf/data.h:58:
pgf/db.h:115:26: error: unknown type name 'thread_local'
extern PGF_INTERNAL_DECL thread_local DB_scope *last_db_scope __attribute__((tls_model("initial-exec")));
                         ^
pgf/db.h:115:47: error: expected ';' after top level declarator
extern PGF_INTERNAL_DECL thread_local DB_scope *last_db_scope __attribute__((tls_model("initial-exec")));
                                              ^
                                              ;
In file included from pgf/db.cxx:6:
In file included from pgf/data.h:60:
pgf/vector.h:11:16: error: a space is required between consecutive right angle brackets (use '> >')
ref<PgfVector<A>> vector_new(size_t len)
               ^~
               > >
pgf/vector.h:13:20: error: a space is required between consecutive right angle brackets (use '> >')
    ref<PgfVector<A>> res = PgfDB::malloc<PgfVector<A>>(len*sizeof(A));
                   ^~
                   > >
pgf/vector.h:13:54: error: a space is required between consecutive right angle brackets (use '> >')
    ref<PgfVector<A>> res = PgfDB::malloc<PgfVector<A>>(len*sizeof(A));
                                                     ^~
                                                     > >
pgf/vector.h:22:43: error: a space is required between consecutive right angle brackets (use '> >')
    ref<C> res = PgfDB::malloc<PgfVector<A>>(offset+len*sizeof(A)).as_object()-offset;
                                          ^~
                                          > >
pgf/vector.h:28:35: error: a space is required between consecutive right angle brackets (use '> >')
ref<A> vector_elem(ref<PgfVector<A>> v, size_t index)
                                  ^~
                                  > >
In file included from pgf/db.cxx:6:
In file included from pgf/data.h:61:
pgf/namespace.h:23:19: warning: alias declarations are a C++11 extension [-Wc++11-extensions]
using Namespace = ref<Node<V>>;
                  ^
pgf/namespace.h:23:29: error: a space is required between consecutive right angle brackets (use '> >')
using Namespace = ref<Node<V>>;
                            ^~
                            > >
In file included from pgf/db.cxx:6:
In file included from pgf/data.h:62:
pgf/expr.h:33:23: error: a space is required between consecutive right angle brackets (use '> >')
        ref<PgfVector<PgfHypo>> hypos;
                             ^~
                             > >
pgf/expr.h:34:26: error: a space is required between consecutive right angle brackets (use '> >')
    ref<PgfVector<PgfExpr>> exprs;
                         ^~
                         > >
pgf/expr.h:283:44: error: a space is required between consecutive right angle brackets (use '> >')
void pgf_context_free(ref<PgfVector<PgfHypo>> hypos);
                                           ^~
                                           > >
In file included from pgf/db.cxx:6:
pgf/data.h:82:34: error: a space is required between consecutive right angle brackets (use '> >')
    ref<PgfVector<ref<PgfEquation>>> defns;
                                 ^~
                                 > > 
pgf/data.h:92:23: error: a space is required between consecutive right angle brackets (use '> >')
        ref<PgfVector<PgfHypo>> context;
                             ^~
                             > >
pgf/db.cxx:301:35: error: use of undeclared identifier 'errno'
            throw pgf_systemerror(errno, filepath);
                                  ^
pgf/db.cxx:305:24: error: use of undeclared identifier 'errno'
            int code = errno;
                       ^
pgf/db.cxx:314:28: error: use of undeclared identifier 'errno'
                int code = errno;
                           ^
pgf/db.cxx:329:20: error: use of undeclared identifier 'errno'
        int code = errno;
                   ^
pgf/db.cxx:334:15: error: use of undeclared identifier 'pthread_rwlock_init'
    int res = pthread_rwlock_init(&rwlock, NULL);
              ^
pgf/db.cxx:337:20: error: use of undeclared identifier 'errno'
        int code = errno;
                   ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
1 warning and 20 errors generated.
make[1]: *** [pgf/libpgf_la-db.lo] Error 1
make: *** [all] Error 2

I've tried updating/reinstalling all the dependencies, and using git clean to clear up any old generated files from src/runtime/c, but it has no effect. I'm using the latest macOS 11.6, the output of g++ --version gives:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 13.0.0 (clang-1300.0.29.3)
Target: x86_64-apple-darwin20.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

A bit of searching makes me think I need to put the flag -std=c++11 somewhere, but I don't really understand the build system used here and don't know where to put it.

krangelov commented 3 years ago

MacOS is using CLang while on Ubuntu I use GCC. They are not completely compatible. Maybe the thread_local pragma is not the same with CLang.

I don't know whether -std=c++11 will help but you can try it. In Makefile.am there is this line:

libpgf_la_CXXFLAGS = -fno-rtti

try replacing it with:

libpgf_la_CXXFLAGS = -fno-rtti -std=c++11

That option works for me also, so if it helps, then just commit the change.

I know that the runtime will not compile on Windows either, but I will look into that when everything else is stable.

On Fri, 8 Oct 2021 at 10:12, John J. Camilleri @.***> wrote:

So far I've been doing everything inside Docker. Today I tried to install the runtime directly in macOS and had some problems. I'm following the old instructions, namely:

$ glibtoolize $ autoreconf -i $ ./configure $ make $ make install

And the make step gives me:

/Applications/Xcode.app/Contents/Developer/usr/bin/make all-am CXX pgf/libpgf_la-db.lo In file included from pgf/db.cxx:6: In file included from pgf/data.h:58: pgf/db.h:115:26: error: unknown type name 'thread_local' extern PGF_INTERNAL_DECL thread_local DB_scope last_db_scope attribute((tls_model("initial-exec"))); ^ pgf/db.h:115:47: error: expected ';' after top level declarator extern PGF_INTERNAL_DECL thread_local DB_scope last_db_scope attribute((tls_model("initial-exec"))); ^ ; In file included from pgf/db.cxx:6: In file included from pgf/data.h:60: pgf/vector.h:11:16: error: a space is required between consecutive right angle brackets (use '> >') ref<PgfVector> vector_new(size_t len) ^~

pgf/vector.h:13:20: error: a space is required between consecutive right angle brackets (use '> >') ref<PgfVector> res = PgfDB::malloc<PgfVector>(len*sizeof(A)); ^~

pgf/vector.h:13:54: error: a space is required between consecutive right angle brackets (use '> >') ref<PgfVector> res = PgfDB::malloc<PgfVector>(len*sizeof(A)); ^~

pgf/vector.h:22:43: error: a space is required between consecutive right angle brackets (use '> >') ref res = PgfDB::malloc<PgfVector>(offset+len*sizeof(A)).as_object()-offset; ^~

pgf/vector.h:28:35: error: a space is required between consecutive right angle brackets (use '> >') ref vector_elem(ref<PgfVector> v, size_t index) ^~

In file included from pgf/db.cxx:6: In file included from pgf/data.h:61: pgf/namespace.h:23:19: warning: alias declarations are a C++11 extension [-Wc++11-extensions] using Namespace = ref<Node>; ^ pgf/namespace.h:23:29: error: a space is required between consecutive right angle brackets (use '> >') using Namespace = ref<Node>; ^~

In file included from pgf/db.cxx:6: In file included from pgf/data.h:62: pgf/expr.h:33:23: error: a space is required between consecutive right angle brackets (use '> >') ref<PgfVector> hypos; ^~

pgf/expr.h:34:26: error: a space is required between consecutive right angle brackets (use '> >') ref<PgfVector> exprs; ^~

pgf/expr.h:283:44: error: a space is required between consecutive right angle brackets (use '> >') void pgf_context_free(ref<PgfVector> hypos); ^~

In file included from pgf/db.cxx:6: pgf/data.h:82:34: error: a space is required between consecutive right angle brackets (use '> >') ref<PgfVector<ref>> defns; ^~

pgf/data.h:92:23: error: a space is required between consecutive right angle brackets (use '> >') ref<PgfVector> context; ^~

pgf/db.cxx:301:35: error: use of undeclared identifier 'errno' throw pgf_systemerror(errno, filepath); ^ pgf/db.cxx:305:24: error: use of undeclared identifier 'errno' int code = errno; ^ pgf/db.cxx:314:28: error: use of undeclared identifier 'errno' int code = errno; ^ pgf/db.cxx:329:20: error: use of undeclared identifier 'errno' int code = errno; ^ pgf/db.cxx:334:15: error: use of undeclared identifier 'pthread_rwlock_init' int res = pthread_rwlock_init(&rwlock, NULL); ^ pgf/db.cxx:337:20: error: use of undeclared identifier 'errno' int code = errno; ^ fatal error: too many errors emitted, stopping now [-ferror-limit=] 1 warning and 20 errors generated. make[1]: [pgf/libpgf_la-db.lo] Error 1 make: [all] Error 2

I've tried updating/reinstalling all the dependencies, and using git clean to clear up any old generated files from src/runtime/c, but it has no effect. I'm using the latest macOS 11.6, the output of g++ --version gives:

g++ --version Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1 Apple clang version 13.0.0 (clang-1300.0.29.3) Target: x86_64-apple-darwin20.6.0 Thread model: posix InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

A bit of searching makes me think I need to put the flag -std=c++11 somewhere, but I don't really understand the build system used here and don't know where to put it.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-938438128, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZBJSQ2FADTMNJD27CLUF2RWTANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

Thanks for the pointer. It made some difference in that I no longer get these kinds of errors:

pgf/data.h:92:23: error: a space is required between consecutive right angle brackets (use '> >')
        ref<PgfVector<PgfHypo>> context;
                             ^~
                             > >

But I still get the others, and some new ones. No matter, I understand this is lower priority and it's not blocking me from working either. But I thought I might as well report how it went.

krangelov commented 3 years ago

It looks like thread_local is supported by the open-source CLang but Apple added it only with XCode 8. Can you try what they suggest here:

https://stackoverflow.com/questions/28094794/why-does-apple-clang-disallow-c11-thread-local-when-official-clang-supports

It would be nice to know that Mac OS is supported.

On Fri, 8 Oct 2021 at 15:26, John J. Camilleri @.***> wrote:

Thanks for the pointer. It made some difference in that I no longer get these kinds of errors:

pgf/data.h:92:23: error: a space is required between consecutive right angle brackets (use '> >') ref<PgfVector> context; ^~

But I still get the others, and some new ones. No matter, I understand this is lower priority and it's not blocking me from working either. But I thought I might as well report how it went.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-938641973, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZFKURZD7WYX2HXLQ7TUF3WO3ANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

So in commit 1401a6d209006b0b6969109c4a26caad76ef0314 I have solved most of the macOS compilation problems, except this:

pgf/db.cxx:887:64: error: use of undeclared identifier 'MREMAP_MAYMOVE'
                (malloc_state*) mremap(ms, old_size, new_size, MREMAP_MAYMOVE);

It seems that mremap is not supported in macOS at all, see here for example: https://stackoverflow.com/questions/3521303/is-there-really-no-mremap-in-darwin

johnjcamilleri commented 3 years ago

Here's a fix for this from a Bitcoin repository, perhaps something similar can be used here: https://github.com/libbitcoin/libbitcoin-blockchain/commit/e23b6e0bd92223a27c981678cd630d5e57c31b81#diff-92ae62ce6f8e61467227bcbd38ddb2876646ae15e36f5a08a91c41acfdb8e84dL87

krangelov commented 3 years ago

I didn't know about that limitation on Mac. It looks like it is the same story on Windows too.

What they did in the Bitcoin repository seems to be the only way. This however may have a negative impact on the performance for large grammars. We can improve the performance by allocating larger chunks. Right now the runtime always allocates pages one by one but if we see that this is a problem on Mac and Windows, we can make it allocate large chunks.

Will you try to make the fix? I have no way to test it myself. We can see if this is a performance issue later.

johnjcamilleri commented 3 years ago

Will you try to make the fix?

Sure

johnjcamilleri commented 3 years ago

In the process I've come across another portability problem, namely the initialisation of flexible array members, which is apparently a GCC extension:

pgf/pgf.cxx:34:34: error: initialization of flexible array member is not allowed
PgfText master = {size: 6, text: {'m','a','s','t','e','r',0}};

References:

This seems to be used in two places, for the definitions of master in pgf.cxx and wildcard in expr.cxx. I've tried a few things but nothing fits neatly in the current code. My ideas:

johnjcamilleri commented 3 years ago

Another issue related to the use of unsigned ints and EOF:

uint32_t PgfExprParser::getc()
{
    uint32_t ch = pgf_utf8_decode((const uint8_t **) &pos);
    if (pos - ((const char*) &inp->text) > inp->size) {
        ch = EOF;
    }
    return ch;
}
pgf/expr.cxx:368:7: error: case value evaluates to -1, which cannot be narrowed to type 'uint32_t' (aka 'unsigned int') [-Wc++11-narrowing]
        case EOF:
             ^
johnjcamilleri commented 3 years ago

Also:

ld: library not found for -lrt

Removing it from libpgf_la_LIBADD in Makefile.am seems to allow everything to compile.

johnjcamilleri commented 3 years ago

I've pushed my current changes (WIP) to the branch majestic-macos, relevant diff is here: https://github.com/GrammaticalFramework/gf-core/compare/8c721e063cf9de1256d2d12054c0b4521ffc109b...majestic-macos

krangelov commented 3 years ago

On Tue, 12 Oct 2021 at 14:59, John J. Camilleri @.***> wrote:

In the process I've come across another portability problem, namely the initialisation of flexible array members, which is apparently a GCC extension:

pgf/pgf.cxx:34:34: error: initialization of flexible array member is not allowed PgfText master = {size: 6, text: {'m','a','s','t','e','r',0}};

It is a pity that it doesn't work with CLang but you found a good workaround.

krangelov commented 3 years ago

I refactored the code to avoid using EOF. Using it was a bad idea anyway.

On Tue, 12 Oct 2021 at 15:23, John J. Camilleri @.***> wrote:

Another issue related to the use of unsigned ints and EOF:

uint32_t PgfExprParser::getc() { uint32_t ch = pgf_utf8_decode((const uint8_t *) &pos); if (pos - ((const char) &inp->text) > inp->size) { ch = EOF; } return ch; }

pgf/expr.cxx:368:7: error: case value evaluates to -1, which cannot be narrowed to type 'uint32_t' (aka 'unsigned int') [-Wc++11-narrowing] case EOF: ^

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-941009848, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZETNPYFMJXDIXKGPNLUGQZGJANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

krangelov commented 3 years ago

librt is needed ipc.cxx but that module is not used yet. That is why removing it worked. It will have to be reconsidered later.

On Tue, 12 Oct 2021 at 15:26, John J. Camilleri @.***> wrote:

Also:

ld: library not found for -lrt

Removing it from libpgf_la_LIBADD in Makefile.am seems to allow everything to compile.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-941011837, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZFOINJEGBP2IPXFSOLUGQZPFANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

krangelov commented 3 years ago

I also fixed the issue with the wildcard.

On Tue, 12 Oct 2021 at 18:50, Krasimir Angelov @.***> wrote:

librt is needed ipc.cxx but that module is not used yet. That is why removing it worked. It will have to be reconsidered later.

On Tue, 12 Oct 2021 at 15:26, John J. Camilleri @.***> wrote:

Also:

ld: library not found for -lrt

Removing it from libpgf_la_LIBADD in Makefile.am seems to allow everything to compile.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-941011837, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZFOINJEGBP2IPXFSOLUGQZPFANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

librt is needed ipc.cxx but that module is not used yet. That is why removing it worked.

From what I understand, whatever is provided by librt on Linux is included in standard libraries on macOS, so it's just a case of the flag itself not being needed on macOS.

https://stackoverflow.com/questions/34379290/library-rt-not-found-on-mac-while-configuring-sfft-sparse-fast-fourier-transfo

johnjcamilleri commented 3 years ago

Well, compiling now works on macOS (yay!), but readPGF immediately gives me a segmentation fault (boo). I'll see what I can find out by debugging.

johnjcamilleri commented 3 years ago

I compiled and run a minimal example. Putting all details here for reference.

Code

#include "pgf/pgf.h"

int main () {
  PgfRevision rev;
  PgfExn err;
  PgfDB *obj = pgf_read_pgf("/Users/john/repositories/GF/gf-core-majestic/src/runtime/haskell/tests/basic.pgf", &rev, &err);
}

Command

clang -lpgf -L/usr/local/lib main.c
lldb -o run ./a.out

Output

Process 81037 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10016d058)
    frame #0: 0x000000010012c7f2 libpgf.0.dylib`PgfDB::~PgfDB(this=0x0000000100404100) at db.cxx:368:17 [opt]
   365  PgfDB::~PgfDB() {
   366      if (ms != NULL) {
   367          size_t size =
-> 368              ms->top + chunksize(ptr(ms,ms->top)) + sizeof(size_t);
   369  
   370          munmap(ms,size);
   371      }
Target 0: (a.out) stopped.

Backtrace

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10016d058)
  * frame #0: 0x000000010012c7f2 libpgf.0.dylib`PgfDB::~PgfDB(this=0x0000000100304100) at db.cxx:368:17 [opt]
    frame #1: 0x000000010012f971 libpgf.0.dylib`::pgf_read_pgf(fpath=<unavailable>, revision=<unavailable>, err=0x00007ffeefbff908) at pgf.cxx:70:9 [opt]
    frame #2: 0x0000000100003f2c a.out`main + 28
    frame #3: 0x00007fff2046bf3d libdyld.dylib`start + 1
    frame #4: 0x00007fff2046bf3d libdyld.dylib`start + 1

Since this doesn't happen in Linux (inside Docker), I first suspected that the culprit is this workaround for mremap in db.cxx line 887:

#ifndef MREMAP_MAYMOVE
            if (munmap(ms, old_size) == -1)
                throw pgf_systemerror(errno);
            malloc_state* new_ms =
                (malloc_state*) mmap(0, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
#else
            malloc_state* new_ms =
                (malloc_state*) mremap(ms, old_size, new_size, MREMAP_MAYMOVE);
#endif

However I a set breakpoint there which did not get triggered, so that can't be the problem? The backtrace mentions pgf.cxx:70, which is this:

69    if (db != NULL)
70        delete db;
johnjcamilleri commented 3 years ago

Slightly more investigation reveals that the read itself isn't working. If I comment out that destructor code, pgf_read_pgf gives me a PGF_EXN_SYSTEM_ERROR with code 9 (Bad file number).

johnjcamilleri commented 3 years ago

Also, loading from NGF with pgf_read_ngf seems to work perfectly well.

krangelov commented 3 years ago

I suspect the failure is in the mmap call in the constructor PgfDB::PgfDB:

int mflags = (fd < 0) ? (MAP_PRIVATE | MAP_ANONYMOUS) : MAP_SHARED;
ms = (malloc_state*)
    mmap(NULL, file_size, PROT_READ | PROT_WRITE, mflags, fd, 0);

Can you verify that?

When pgf_read_pgf is used it calls the constructor with NULL for the file name. This one the other hand tells it to create an anonymous memory mapped file. Maybe anonymous files work differently on Mac.

On Wed, 13 Oct 2021 at 15:03, John J. Camilleri @.***> wrote:

Also, loading from NGF with pgf_read_ngf seems to work perfectly well.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-942282614, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZGHPW7GMBWQNFZU5RTUGV7QLANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

krangelov commented 3 years ago

This page:

https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/mmap.2.html

says that on Mac MAP_ANON should be used. On Linux MAP_ANON is deprecated but is otherwise a synonym to MAP_ANONYMOUS.

On Wed, 13 Oct 2021 at 17:10, Krasimir Angelov @.***> wrote:

I suspect the failure is in the mmap call in the constructor PgfDB::PgfDB:

int mflags = (fd < 0) ? (MAP_PRIVATE | MAP_ANONYMOUS) : MAP_SHARED;
ms = (malloc_state*)
    mmap(NULL, file_size, PROT_READ | PROT_WRITE, mflags, fd, 0);

Can you verify that?

When pgf_read_pgf is used it calls the constructor with NULL for the file name. This one the other hand tells it to create an anonymous memory mapped file. Maybe anonymous files work differently on Mac.

On Wed, 13 Oct 2021 at 15:03, John J. Camilleri @.***> wrote:

Also, loading from NGF with pgf_read_ngf seems to work perfectly well.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-942282614, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZGHPW7GMBWQNFZU5RTUGV7QLANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

I've stepped through in the debugger but I cannot see anything going wrong with the call to mmap. The check with MAP_FAILED immediately afterwards is false. That page you linked is outdated, plus seems to be for iOS not macOS. When I run man mmap on my system it says:

     MAP_ANONYMOUS     Synonym for MAP_ANON.

I couldn't find the documentation online, so here's a gist of it: man mmap

It seems to be this call in PgfReader::read_pgf which is causing the crash:

read_abstract(ref<PgfAbstr>::from_ptr(&pgf->abstract));

I know that read_absract is never reached. When I try to print pgf->abstract in the debugger I get the message "Couldn't lookup symbols" (maybe I need to recompile the runtime with/out some flags).

krangelov commented 3 years ago

To have access to the symbol information in the debugger, try this:

Do you know if there is any way to get a virtual Mac on Ubuntu? I can emulate Windows by using Wine but I don't know about Mac.

Best Regards, Krasimir

On Thu, 14 Oct 2021 at 09:00, John J. Camilleri @.***> wrote:

I've stepped through in the debugger but I cannot see anything going wrong with the call to mmap. The check with MAP_FAILED immediately afterwards is false. That page you linked is outdated, plus seems to be for iOS not macOS. When I run man mmap on my system it says:

 MAP_ANONYMOUS     Synonym for MAP_ANON.

I couldn't find the documentation online, so here's a gist of it: man mmap https://gist.github.com/johnjcamilleri/1cdc73d25420302035fb40df1aac2d75#file-mmap

It seems to be this call in PgfReader::read_pgf which is causing the crash:

read_abstract(ref::from_ptr(&pgf->abstract));

I know that read_absract is never reached. When I try to print pgf->abstract in the debugger I get the message "Couldn't lookup symbols" (maybe I need to recompile the runtime with/out some flags).

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-943069501, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZFZYZB75PTXT3HPD53UGZ5YVANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

Thanks! Either I was doing something wrong before, or turning off optimisation has changed things a little, such that now it seems that the call to

    abstract->funs = read_namespace<PgfAbsFun>(&PgfReader::read_absfun);

in PgfReader::read_abstract is the source. At some point it seems to "crash" and then the destructor of PgfDB suddenly gets called:

Process 99205 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100140a1c libpgf.0.dylib`PgfReader::read_absfun(this=0x00007ffeefbff888) at reader.cxx:382:9
   379  ref<PgfAbsFun> PgfReader::read_absfun()
   380  {
   381      ref<PgfAbsFun> absfun =
-> 382          read_name<PgfAbsFun>(&PgfAbsFun::name);
   383      absfun->ref_count = 1;
   384      ref<PgfExprFun> efun =
   385          ref<PgfExprFun>::from_ptr((PgfExprFun*) &absfun->name);
Target 0: (a.out) stopped.
(lldb) n
Process 99205 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x100189058)
    frame #0: 0x000000010012a7f7 libpgf.0.dylib`PgfDB::~PgfDB(this=0x00000001003044d0) at db.cxx:369:17
   366  PgfDB::~PgfDB() {
   367      if (ms != NULL) {
   368          size_t size =
-> 369              ms->top + chunksize(ptr(ms,ms->top)) + sizeof(size_t);
   370  
   371          munmap(ms,size);
   372      }
Target 0: (a.out) stopped.

No, I know nothing about emulating macOS in Ubuntu. A quick search seems to indicate that it's actively discouraged by Apple. You could test things in a GitHub Actions workflow (which I will set up for macOS anyway) although testing by pushing & waiting is hardly optimal either. Perhaps we should schedule a meeting and debug together.

krangelov commented 3 years ago

Do you have time tomorrow for a meeting? I am available between 13:00-14:00 and 15:00-16:00.

On Thu, 14 Oct 2021 at 10:59, John J. Camilleri @.***> wrote:

Thanks! Either I was doing something before, or turning off optimisation has changed things a little, such that now it seems the call to

abstract->funs = read_namespace<PgfAbsFun>(&PgfReader::read_absfun);

in PgfReader::read_abstract is the source. At some point it seems to "crash" and then the destructor of PgfDB suddenly gets called:

Process 99205 stopped

  • thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1 frame #0: 0x0000000100140a1c libpgf.0.dylib`PgfReader::read_absfun(this=0x00007ffeefbff888) at reader.cxx:382:9 379 ref PgfReader::read_absfun() 380 { 381 ref absfun = -> 382 read_name(&PgfAbsFun::name); 383 absfun->ref_count = 1; 384 ref efun = 385 ref::from_ptr((PgfExprFun*) &absfun->name); Target 0: (a.out) stopped. (lldb) n Process 99205 stopped
  • thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x100189058) frame #0: 0x000000010012a7f7 libpgf.0.dylib`PgfDB::~PgfDB(this=0x00000001003044d0) at db.cxx:369:17 366 PgfDB::~PgfDB() { 367 if (ms != NULL) { 368 size_t size = -> 369 ms->top + chunksize(ptr(ms,ms->top)) + sizeof(size_t); 370
    371 munmap(ms,size); 372 } Target 0: (a.out) stopped.

No, I know nothing about emulating macOS in Ubuntu. A quick search seems to indicate that it's actively discouraged by Apple. You could test things in a GitHub Actions workflow (which I will set up for macOS anyway) although testing by pushing & waiting is hardly optimal either. Perhaps we should schedule a meeting and debug together.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-943156428, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZBM32EN4JGHVDVBBXLUG2LVHANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

I realized what is probably happening if indeed you actually go to the munmap/mmap part of the code.

When we have a file (i.e. fd > 0), then munmap/mmap works since the data is stored in the file. On the other hand when you have fid < 0 then munmap is just a more efficient version of free, and mmap becomes equivalent to malloc. In that case, of course, after free+malloc we will lose all the data.

On Mac the case when fd < 0 should be treated differently. In PgfDB::PgfDB the initial block should be allocated with malloc. In PgfDB::malloc_internal, instead of munmap+mmap we should use realloc().

I understand what you're suggesting and tried to make the changes in the majestic-macos branch, however there's a compilation error I can't understand:

pgf/db.cxx:331:28: error: no matching function for call to 'malloc'
      ms = (malloc_state*) malloc(file_size); // doesn't compile
                           ^~~~~~

More generally, what is the reason that this needs to be different for macOS?

krangelov commented 3 years ago

In db.cxx when you say malloc the compiler tries to call PgfDB::malloc instead of the standard malloc. Try using "::malloc". You may also have to add #include .

On Fri, 15 Oct 2021 at 15:28, John J. Camilleri @.***> wrote:

I realized what is probably happening if indeed you actually go to the munmap/mmap part of the code.

When we have a file (i.e. fd > 0), then munmap/mmap works since the data is stored in the file. On the other hand when you have fid < 0 then munmap is just a more efficient version of free, and mmap becomes equivalent to malloc. In that case, of course, after free+malloc we will lose all the data.

On Mac the case when fd < 0 should be treated differently. In PgfDB::PgfDB the initial block should be allocated with malloc. In PgfDB::malloc_internal, instead of munmap+mmap we should use realloc().

I understand what you're suggesting and tried to make the changes in the majestic-macos branch https://github.com/GrammaticalFramework/gf-core/compare/majestic...majestic-macos, however there's a compilation error I can't understand:

pgf/db.cxx:331:28: error: no matching function for call to 'malloc' ms = (malloc_state*) malloc(file_size); // doesn't compile ^~

More generally, what is the reason that this needs to be different for macOS?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-944302290, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZEIQNHY42LK5D6BIBLUHAT7BANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

Right, thanks! This seems to have worked. I'm now getting a lot further and pass all the Haskell "basic" tests. Now I'm getting a new error:

transactions: branchPGF: invalid argument (Invalid argument)

but I will need to debug this further. Could it be related to the call to mmap in ipc.cxx?

krangelov commented 3 years ago

ipc.cxx is not used yet. The problem should be something else.

On Fri, Oct 15, 2021, 17:48 John J. Camilleri @.***> wrote:

Right, thanks! This seems to have worked. I'm now getting a lot further and pass all the Haskell "basic" tests. Now I'm getting a new error:

transactions: branchPGF: invalid argument (Invalid argument)

but I will need to debug this further. Could it be related to the call to mmap in ipc.cxx?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-944407006, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZFISFCGEAKISGCXZY3UHBELRANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

If I want to use the runtime from C, surely I don't need to implement my own un/marshaller? I see there are implementations of all the marshalling functions in expr.cxx, but these are not exposed in any way. Maybe if one passes NULL when an unmarshaller is needed, it can/should use these internal implementations?

johnjcamilleri commented 3 years ago

The cause behind the 'Invalid argument' error mentioned above seems to be this line in PgfDB::sync:

int res = msync((void *) ms, size, MS_SYNC | MS_INVALIDATE);

Which returns -1. Any idea how this should be changed for macOS (when ms is malloc'ed)?

johnjcamilleri commented 3 years ago

For reference, macOS man page for msync: https://gist.github.com/johnjcamilleri/1cdc73d25420302035fb40df1aac2d75#file-msync

johnjcamilleri commented 3 years ago

Any idea how this should be changed for macOS (when ms is malloc'ed)?

I tested not calling mysnc at all in this case, and seems to work on all platforms (see CI here). But I'm not sure if it's the right solution.

krangelov commented 3 years ago

If fd < 0 then skipping msync is safe on all platforms. On Linux it will just do nothing in that case.

On Mon, Oct 18, 2021, 13:59 John J. Camilleri @.***> wrote:

Any idea how this should be changed for macOS (when ms is malloc'ed)?

I tested not calling mysnc at all in this case, and seems to work on all platforms (see CI here https://github.com/GrammaticalFramework/gf-core/actions/runs/1354485057). But I'm not sure if it's the right solution.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-945691444, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZFGG6PZTPAAFQM3CADUHQD2VANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

krangelov commented 3 years ago

The (un)marshaller in expr.cxx only works with the internal database representations of the expressions. If someone wants to use the library from C then he/she should define own data structures and marshallers. It is also possible to define marshallers who start from an application specific data structure and then do NLG without explicitly generating the abstract syntax tree.

I haven't anyone yet who would like to use the runtime from C :-)

On Mon, Oct 18, 2021, 13:11 John J. Camilleri @.***> wrote:

If I want to use the runtime from C, surely I don't need to implement my own un/marshaller? I see there are implementations of all the marshalling functions in expr.cxx, but these are not exposed in any way. Maybe if one passes NULL when an unmarshaller is needed, it can/should use these internal implementations?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/GrammaticalFramework/gf-core/issues/130#issuecomment-945658684, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEYFSZG2G2RGREGZY4M3VNDUHP6GDANCNFSM5BZSYMKQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

johnjcamilleri commented 3 years ago

Ok, thanks for the clarifications!

I haven't anyone yet who would like to use the runtime from C :-)

Haha fair enough. I did this myself for testing purposes, in order to exclude the bindings themselves as a source of problems, but actually I managed to avoid using any functions that required un/marshalling.