get_builtins_host_obj_path requires `lib/` when that doesn't make sense for the platform - Githubissues

roc-lang / roc

A fast, friendly, functional language.

https://roc-lang.org

Universal Permissive License v1.0

4.42k stars 310 forks source link

get_builtins_host_obj_path requires `lib/` when that doesn't make sense for the platform #3718

Open BrianHicks opened 2 years ago

BrianHicks commented 2 years ago

As of 43f9e2a7 (#1848), Roc requires a lib directory to be present by requiring it with an expect in get_builtins_host_obj_path in crates/compiler/builtins/src/bitcode.rs. This doesn't always make sense, for example in Rust platforms. Should that function return an Option<String> or something?

cc @Anton-4 since you were working on that PR

Anton-4 commented 2 years ago

If get_builtins_host_obj_path is not needed in that situation then why is it called? All rust platforms call build_c_host_native and build_chost_native requires get_builtins_host_obj_path (see compiler/build/src/link.rs).

Is get_builtins_host_obj_path failing in a specific situation?

Anton-4 commented 2 years ago

In compiler/cli/src/build.rs bitcode::get_builtins_host_obj_path() is now called due to change of mine even when the OptLevel is not Development. With the env var there was a const String so I now have to call it in all cases so that the function can own the String. In hindsight using an option there would probably work but if get_builtins_host_obj_path() is failing there it would also fail if the OptLevel actually is Development (this opt_level is set with a flag).

BrianHicks commented 2 years ago

Is get_builtins_host_obj_path failing in a specific situation?

It's when lib doesn't exist. Creating an empty one resolves the issue. The code for this platform lives in src, though.

Anton-4 commented 2 years ago

The dir should be created automatically, perhaps that does not work yet with the nightly releases :thinking: Are you using a nightly release?

BrianHicks commented 2 years ago

hah, I'm not using a release at all, but pulling from trunk and rebuilding. TBH I forgot they existed!

That said, I pulled just a bit ago and 0ca0da85cfa6a68cc5e3af07b0935a8e3f471d4c still has this problem.

Why should the dir be created automatically? Do I need to track it in my platform?

Anton-4 commented 2 years ago

compiler/builtins/build.rs::get_lib_dir creates it if it does not exist. There should be no need to track it in the platform.

Can you share the path you put your code in, the path you are running it from and the command you use to run it?

BrianHicks commented 2 years ago

The project is https://github.com/rtfeldman/rbt. I'm building roc and putting in the root, then invoking it like ./roc run examples/hello/rbt.roc

Anton-4 commented 2 years ago

After building roc, the lib dir should also be in the target dir, copying that over too should fix the issue. Alternatively, changing your working directory to roc/target/release and running ./roc /fullpath/examples/hello/rbt.roc is probably the easiest way because there is no need to copy anything.

Anton-4 commented 2 years ago

To improve the expect message, I would suggest: "Failed to find the lib directory. Did you copy the roc binary without also copying the lib directory?". What do you think @BrianHicks ?

BrianHicks commented 2 years ago

That seems reasonable!

However, are we gonna require people to copy a lib directory around wherever roc also is? Why isn't the compiler just a standalone binary?

Anton-4 commented 2 years ago

"However, are we gonna require people to copy a lib directory around wherever roc also is?" No manual copying will be necessary for releases.

"Why isn't the compiler just a standalone binary?" I believe this is so that the builtins can be linked in with the final user program and the whole thing can be optimized as one unit. Right @folkertdev?

rtfeldman commented 2 years ago

The long-term goal is definitely to have the compiler be a standalone binary, so I'm adding the bug label to this!

bhansconnect commented 2 years ago

I added this (the lib folder, not the dependency) because I got tired of roc not working in other directories. It only worked when in the roc root directory or one level deeper. This made it work from any folder.

If these files are only used for zig platforms, I think that we should remove the lib directory completely. There should be a dedicated zig library for Roc types (or they should be generated by glue). Frankly, the files as they are now are a convenience for zig, but they are not what we would actually want to expose as a zig library. They have a lot of random functions, are more powerful than necessary, and use a bunch of c abi that shouldn't be needed when called from zig only.

bhansconnect commented 2 years ago

Also, just to be fair, most language compilers are not standalone binaries. They really on standard libraries and etc. Even zig has a lib folder that must be copied around. So I think it is the exception not the rule that a compiler can be a standalone binary. Just when installing, both the library and the binary get placed in the correct spot. Roc may be one of the few compilers where we can avoid this, but that is cause our standard library for host will be pulled in by the host and our standard library for apps should be fully ingested (small load time cost).

ayazhafiz commented 2 years ago

If are able to install the compiler without needing to install additional libraries, I think we should definitely prefer that. It makes our the compiler devs' lives easier, but more importantly the users' lives easier.

bhansconnect commented 2 years ago

Oh shoot, I forgot that this is used for the host object and the zig source files.

So another addition. The host object file doesn't make sense and needs to be remove. Host is a very specific target and does not makes senses to distribute. It could use instructions that are not available to the end user.

I am pretty sure this object file is only needed for the dev backend. This code is generated with llvm bytecode in the main backend. It might be reasonable to generate generic versions of this file for each architecture and ingest them into the roc app, but that would bloat the roc app for every single target we want to support (currently by about 260k). Not huge, but would add up across os, cpu architecture pairs.

We don't want Roc to depend on zig being installed, so we don't want to ingest the zig files directly. Though that would be a simple solution. Ingest the zig files and compile them into an object file on the host. Instead, I think the best solution here is to use the ingested llvm bytecode to generate this object the first time that roc is run per target. Then we dump it to disk in some sort of global roc related cache and it will only need to be generated again when Roc updates.

rtfeldman commented 2 years ago

It might be reasonable to generate generic versions of this file for each architecture and ingest them into the roc app, but that would bloat the roc app for every single target we want to support (currently by about 260k). Not huge, but would add up across os, cpu architecture pairs.

This is the solution I prefer.

It's been a goal from day one of the project to have one single binary that you can download anywhere and immediately execute on any of the major operating systems.

I also think that since surgical linking lets us support cross-compiling, we'll want to bundle stdlib implementations for all the different targets anyway; they won't sit unused if you're cross-compiling!

Bundling Zig is always an option, but at +40MB we'd need quite a lot of target combinations before that was a cost savings!

bhansconnect commented 2 years ago

Since we have to ingest the llvm bytecode files anyway, is there an advantage to ingesting both the object files and the bytecode files? With the bytecode files there is just an extra step on the first compile with a dev backend. It has no cost to the llvm backend.

we'll want to bundle stdlib implementations for all the different targets anyway

I'm not sure I undestand what you mean here. What do you mean by stdlib? Our standard library to my understanding is these llvm bytecode or object files. So this discussion is already about bundling them.

rtfeldman commented 2 years ago

Oh yeah, sorry - we are talking about the same thing! My wording wasn't ideal. 😅

Since we have to ingest the llvm bytecode files anyway, is there an advantage to ingesting both the object files and the bytecode files? With the bytecode files there is just an extra step on the first compile with a dev backend. It has no cost to the llvm backend.

This makes sense to me! 👍

We'll want a global ~/.roc-cache (or whatever) directory to store things like downloaded packages, so bundling the .bc files into the executable and then using LLVM to go from .bc to object files on a per-(target + compiler version) basis and caching them in that directory sounds like a good approach!

bhansconnect commented 2 years ago

hopefully it will be ~/.cache/roc