janet-lang / janet

A dynamic language and bytecode vm
https://janet-lang.org
MIT License

Error when executing compiled executable #1222

Closed tionis closed 1 year ago

tionis commented 1 year ago

When I compile an executable with jpm's declare-executable the resulting binary does not work and only produces:

janet top level signal - "funcdef has invalid bytecode"

tionis commented 1 year ago

After git bisecting it seems c83f3ec09757eb48bf7d48f3063b39e4d8bd9345 is the problem

bakpakin commented 1 year ago

Perhaps try clearing jpm cache and rebuild all dependencies from scratch, the addition of new instructions would break existing compiled bytecode.

Given the bisected commit, without a repro I would have to guess this is user error: somehow using one version of janet to build and a different version of janet to execute.

tionis commented 1 year ago

I'll try, but I reproduced the problem with a file just containing the following:

(defn main [_]
  (print "Hi!"))

and was getting the same error

tionis commented 1 year ago

Yeah clearing all manifests and the cache does nothing. The error occurs even when nothing is imported

tionis commented 1 year ago

I also cleared my lib and include dirs

bakpakin commented 1 year ago

I'm pretty sure your installation is linking in old code then. This does not reproduce for me on Linux with latest Janet and on master. On Windows, perhaps an old janet.lib or libjanet.lib is being used, which would certainly cause this error.

tionis commented 1 year ago

Hm, weird. I'm running with my local install on Arch Linux. Since I'm using PREFIX=$HOME/.local, deleting .local/lib and .local/include should remove all old code, right? I'm also not sure why it would include old code, as the file to compile imports nothing and I reinstall jpm every time I recompile janet.

tionis commented 1 year ago

I'll try to reproduce it somehow in a clean env and maybe inspect all elements of my local install again.

tionis commented 1 year ago

I can't reproduce the issue in a container or anything like that, but it persists in my environment even after deleting everything and reinstalling it all manually. Running strace on jpm build also doesn't show access to anything other than the cleanly built files.

tionis commented 1 year ago

My collected data on this is at https://tasadar.net/tionis/janet-error

bakpakin commented 1 year ago

I'm guessing the linker is picking up libjanet.a then from somewhere - what does jpm show-paths show for libpath? This is where the static and dynamic libraries will be linked to the executables created by jpm.

sogaiu commented 1 year ago

FWIW, I cloned janet-error and did jpm build followed by jpm install.

Invoking the installed executable seemed to work:

$ janet-test-error
Hi!

tionis commented 1 year ago

I also only have this problem on one machine (a ThinkPad Yoga with Arch Linux), but I can't find the error. I've completely nuked the install multiple times.

tionis commented 1 year ago

@bakpakin commented on Jul 15, 2023, 2:37 AM GMT+2:

I'm guessing the linker is picking up libjanet.a then from somewhere - what does jpm show-paths show for libpath? This is where the static and dynamic libraries will be linked to the executables created by jpm.

That's what I thought too, but I looked into every include path on my system and there is no janet header to be found anywhere but in .local/lib. jpm's libpath also specifies the .local/lib path.

sogaiu commented 1 year ago

I also only have this problem on one machine (a ThinkPad Yoga with Arch Linux), but I can't find the error. I've completely nuked the install multiple times.

I think you mentioned elsewhere that you'd worked on an Arch Linux PKGBUILD that GrayJack was / is the maintainer of.

Is it possible there is some remnant from that or related AUR package?

May be you tried already, but sometimes I use a different user for testing...if that hasn't been tried already may be it's worth considering?

tionis commented 1 year ago

To be more complete here's the output of jpm show-paths

tree:
binpath:    /home/tionis/.local/bin
modpath:    /home/tionis/.local/lib/janet
syspath:    /home/tionis/.local/lib/janet
manpath:    /home/tionis/.local/share/man/man1
libpath:    /home/tionis/.local/lib/janet
headerpath: /home/tionis/.local/include/janet
buildpath:  build/
gitpath:    git
tarpath:    tar
curlpath:   curl

tionis commented 1 year ago

I think you mentioned elsewhere that you'd worked on an Arch Linux PKGBUILD that GrayJack was / is the maintainer of.

Is it possible there is some remnant from that or related AUR package?

I didn't test it on this machine, I think. I also couldn't find any remnants of another janet installation when searching through /usr/include, /usr/local/include, /usr/lib and friends.

sogaiu commented 1 year ago

Long shot -- do you have any JANET_* environment variables set for the user?
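
One way to check this is to filter the process environment for the prefix. The sketch below is only an illustration: `list_janet_env` is a hypothetical helper name, and it relies on nothing beyond the standard `env` and `grep` tools.

```shell
# Hypothetical helper: print every environment variable whose name starts with JANET_.
# Prints a short notice instead when none are set.
list_janet_env() {
  env | grep '^JANET_' || echo "no JANET_* variables set"
}

list_janet_env
```

Running it in the affected user's shell and again in the clean test user's shell, then comparing the two outputs, would show whether an environment variable accounts for the difference.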

tionis commented 1 year ago

May be you tried already, but sometimes I use a different user for testing...if that hasn't been tried already may be it's worth considering?

That was a good idea, the other user with his own clean install worked. So the problem must be in some lost files in the user dir of my main user

tionis commented 1 year ago

Long shot -- do you have any JANET_* environment variables set for the user?

The output of jpm show-paths should show where all of those resolve to

sogaiu commented 1 year ago

The output of jpm show-paths should show where all of those resolve to

May be so.

I wasn't sure about all of the values though:

(defn show-paths
  []
  (print "tree:       " (dyn :tree))
  (print "binpath:    " (dyn:binpath))
  (print "modpath:    " (dyn:modpath))
  (print "syspath:    " (dyn :syspath))
  (print "manpath:    " (dyn :manpath))
  (print "libpath:    " (dyn:libpath))
  (print "headerpath: " (dyn:headerpath))
  (print "buildpath:  " (dyn :buildpath "build/"))
  (print "gitpath:    " (dyn :gitpath))
  (print "tarpath:    " (dyn :tarpath))
  (print "curlpath:   " (dyn :curlpath)))

as I don't understand exactly what things like (dyn:libpath) (the calls that are not like (dyn :manpath) -- i.e. the plain dyn calls) and friends would end up with.

bakpakin commented 1 year ago

libpath looks wrong to me, I would have expected /home/tionis/.local/lib instead of /home/tionis/.local/lib/janet

tionis commented 1 year ago

I changed it and recompiled Janet and the demo repo and am still getting the same error

bakpakin commented 1 year ago

Is it possible you have (old) files in /usr/local or /usr that are getting pulled in? That would cause this kind of issue.

tionis commented 1 year ago

That's one of the first things I checked and there aren't any old files.

sogaiu commented 1 year ago

May be it's worth running find (or equivalent) in the user's home directory [1] for libjanet.a (or files with janet in their name)?


[1] Possibly elsewhere as well.
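
A minimal sketch of such a search, assuming a POSIX `find`. `find_libjanet` is a hypothetical helper name, and the roots passed to it are only examples to adjust for the prefixes in use on the machine.

```shell
# Hypothetical helper: list every libjanet static or shared library under the given roots.
# Errors from unreadable or missing directories are discarded.
find_libjanet() {
  find "$@" \( -name 'libjanet.a' -o -name 'libjanet.so*' \) 2>/dev/null | sort
}

# Example roots only; add any other prefix that has ever held a janet install.
find_libjanet "$HOME/.local" /usr/local/lib /usr/lib
```

More than one libjanet.a in the output is worth a closer look, since the linker can only use one of them.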

tionis commented 1 year ago

@sogaiu commented on Jul 17, 2023, 12:57 AM GMT+2:

May be it's worth running find (or equivalent) in the user's home directory [1] for libjanet.a (or files with janet in their name)?


[1] Possibly elsewhere as well.

Originally posted by @sogaiu in https://github.com/janet-lang/janet/issues/1222#issuecomment-1637209773

I've already searched through my .local and /usr with fd and fzf for any mention of janet, and the only files I could find were the freshly compiled lib/libjanet.so, lib/janet/libjanet.a, lib/libjanet.a, lib/libjanet.so.1.29, lib/libjanet.so.1.29.1, include/janet.h and include/janet/janet.h

bakpakin commented 1 year ago

Take a binary diff of those two libjanet.a files - there really should only be one and I don't know why you have two looking at the Makefile. Try deleting lib/janet/libjanet.a
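
A byte-for-byte comparison can be done with the standard `cmp` tool. The sketch below is an illustration only: `same_archive` is a hypothetical helper name, and the two paths are the ones reported earlier in this thread under a `$HOME/.local` prefix.

```shell
# Hypothetical helper: report whether two files are byte-for-byte identical.
# Prints "identical", "different", or "missing" if either file does not exist.
same_archive() {
  if [ ! -f "$1" ] || [ ! -f "$2" ]; then
    echo "missing"
  elif cmp -s "$1" "$2"; then
    echo "identical"
  else
    echo "different"
  fi
}

# Paths assumed from the file listing reported above.
same_archive "$HOME/.local/lib/libjanet.a" "$HOME/.local/lib/janet/libjanet.a"
```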

bakpakin commented 1 year ago

Also, if you are using the AUR packages, the installation scripts may be suspect.

Note that I do most of my development on an Arch system, so Arch is not the issue; if you are using an AUR package, it is likely the PKGBUILD file.

bakpakin commented 1 year ago

I wonder if your compiler is picking up libjanet.a from somewhere else on the system when linking.

Another test you can run:

Create a file temp.janet with following contents:

(defn main
  [&]
  (print "hello"))

Then run

jpm -v quickbin temp.janet temp; ./temp

And it should print "hello" as expected, and print the commands used to compile.

Output on my machine:

generating executable c source temp.c from temp.janet...
compiling temp.c to build/temp.o...
cc -c temp.c -DJANET_BUILD_TYPE=release -std=c99 -I/usr/local/include/janet -I/usr/local/lib/janet -O2 -o build/temp.o
linking temp...
cc -std=c99 -I/usr/local/include/janet -I/usr/local/lib/janet -O2 -o temp build/temp.o /usr/local/lib/libjanet.a -lm -ldl -lrt -pthread -rdynamic
hello

tionis commented 1 year ago

@bakpakin commented on Jul 17, 2023, 1:36 AM GMT+2:

Take a binary diff of those two libjanet.a files - there really should only be one and I don't know why you have two looking at the Makefile. Try deleting lib/janet/libjanet.a

Originally posted by @bakpakin in https://github.com/janet-lang/janet/issues/1222#issuecomment-1637217223

They are both the same, but when I deleted lib/libjanet.a the problem went away. I discovered a problem in my janet compile script, mostly due to misunderstanding some of the janet env variables. I will close this as user error, but I think some documentation on how to configure a janet environment with the correct env vars would be useful.

tionis commented 1 year ago

I'm still very much confused why the same janet install script only failed on one of the three machines I tested it on.

tionis commented 1 year ago

Thanks for all the help in diagnosing this @bakpakin @sogaiu !

tionis commented 1 year ago

@bakpakin commented on Jul 17, 2023, 1:42 AM GMT+2:

Also, if you are using the AUR packages, the installation scripts may be suspect.

Note that I do most of my development on an Arch system, so Arch is not the issue; if you are using an AUR package, it is likely the PKGBUILD file.

I also want to note that I didn't use the AUR package at all. I've been added as a maintainer of it and want to fix it at some time in the future. Ideally for such installs, the concept of system-global packages and user-local packages would exist

sogaiu commented 1 year ago

the concept of system-global packages and user-local packages would exist

Not sure I follow what this means, but I think each JANET_PATH-like thing can only be pointed at a single directory at a time. That is, they aren't like PATH which typically can be pointed at more than one directory.

It seems technically possible for there to be one PKGBUILD for folks who want a system-type install and another PKGBUILD for folks who want a PREFIX=$HOME/.local type of install.

Or may be I'm not understanding your original comment...

tionis commented 1 year ago

@sogaiu commented on Jul 18, 2023, 1:03 AM GMT+2:

the concept of system-global packages and user-local packages would exist

Not sure I follow what this means, but I think each JANET_PATH-like thing can only be pointed at a single directory at a time. That is, they aren't like PATH which typically can be pointed at more than one directory.

It seems technically possible for there to be one PKGBUILD for folks who want a system-type install and another PKGBUILD for folks who want a PREFIX=$HOME/.local type of install.

Or may be I'm not understanding your original comment...

What I mean is a setup more like python, where you have two package caches that are searched during import. First the local one (in .local or something similar) and then the global one (that only root can write to). When installing, jpm could take a flag to do a user-local package install or simply check which location it has write permission for and choose the first one.
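
The "choose the first writable location" idea could be sketched roughly like this. It is not something jpm does today: `first_writable` is a hypothetical helper, and the candidate directories are assumptions picked for illustration.

```shell
# Hypothetical helper: print the first candidate directory that exists and is
# writable by the current user; return non-zero if none qualifies.
first_writable() {
  for dir in "$@"; do
    if [ -d "$dir" ] && [ -w "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Assumed candidates: a system-wide location first, then a user-local one.
first_writable /usr/local/lib/janet "$HOME/.local/lib/janet" \
  || echo "no writable install location"
```

With this ordering, root would install into the system-wide location, while an unprivileged user would fall through to the user-local one.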

I'll create a new discussion to debate this.