ballerina-platform / ballerina-spec

Ballerina Language and Platform Specifications

Embedding static resources in Ballerina binaries #1100

Open sameerajayasoma opened 2 years ago

sameerajayasoma commented 2 years ago

The current design is suboptimal and works only with jBallerina: anything in the resources/ directory is copied into the executable .jar file. However, there is no API to access these resources at runtime, so one has to use Java interop to read them.

We need a design that works for jBallerina and nBallerina.

We looked at several approaches. The first one is similar to the //go:embed directive, where we can annotate a module-level variable with the path of the file to be embedded. The file's content will be added as the variable value at compile time.

@embed {pattern: "hello.txt"}
final string s = "";

@embed {pattern: "hello.txt"}
final readonly & byte[] b = [];

The second approach we discussed is to use compile-time functions to load the content of a file.

const string s = foo:read("hello.txt");

The third approach is to use configurable consts. We considered compile-time configurable consts in the initial stages of the configurable feature, but postponed them because there were no real requirements.

configurable const string s = ?;

jclark commented 2 years ago

My thoughts on these:

jclark commented 2 years ago

C is adding similar functionality, as the #embed directive:

https://thephd.dev/finally-embed-in-c23

sanjiva commented 2 years ago

Another, more first-class way to solve this would be to add the concept of a "data" variable: data string s = ("some string" | ?); and then use a const annotation @embed {} to specify how to populate it.

This way the semantics are maintained: it is a piece of data that is filled in at compile time with either a constant value or a compile-time resource. I'm not sure ? is the best way to say that some extra information is expected.

sameerajayasoma commented 1 year ago

I propose going ahead with the @embed approach here. It's similar to C's #embed and Go's //go:embed.

@jclark I didn't quite understand your point about the @embed annotation.

sameerajayasoma commented 1 year ago

Here are the proposed semantics of the @embed annotation:

hasithaa commented 1 year ago

Here are my thoughts on @sameerajayasoma's Proposal.

jclark commented 1 year ago

@sameerajayasoma My concern is that the language already specifies clear semantics for:

final string s = "";

That is, s is initialized to the empty string and cannot be assigned to thereafter. The semantics of the proposed @embed annotation contradict this. Annotations should augment language semantics, not contradict them.

I'm a strong -1 to the @embed annotation.

jclark commented 1 year ago

Why can't we do this with a regular library function?

manuranga commented 1 year ago

This is a good use case for compile-time functions, if and when we have them.

sanjiva commented 1 year ago

How about

@embed {pattern: "x.text"}
const string data = ?;

This declares a constant and says that its value comes from somewhere else. The static annotation says where the value comes from.

Effectively this desugars to:

const string data = "blah blah blah";

assuming the file x.text contains "blah blah blah".

chathurace commented 1 year ago

Desugaring large text data into strings may cause the compiler to fail. I experienced such compilation failures when very large text values (i.e. encoded EDI mappings) were assigned to strings as constants. Loading the same text into a string at runtime does not cause any issues.

I think it would be more flexible to introduce a library function that loads resources at runtime.

sanjiva commented 1 year ago

Chathura, you can always do this even now:

final string foo = check io:fileReadString("path-to-file");

But the catch is that you have to get the file into the right place at runtime.
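
A slightly fuller sketch of that runtime read, to make the catch concrete. The path resources/mapping.json is a hypothetical placeholder; io:fileReadString is the existing ballerina/io function. If the file is not where the program expects it at runtime, the error branch is taken.

```ballerina
import ballerina/io;

// Hypothetical path, resolved against the working directory at runtime.
const string MAPPING_PATH = "resources/mapping.json";

public function main() {
    string|io:Error content = io:fileReadString(MAPPING_PATH);
    if content is io:Error {
        // The file was not where the program expected it at runtime.
        io:println("cannot read ", MAPPING_PATH, ": ", content.message());
        return;
    }
    io:println("loaded ", content.length(), " characters");
}
```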

We can't do Java-style resources embedded in a jar, because at some point in the future the binary format will not be a zip file.

Ideally, the compiler should not load the bytes into memory but just transfer them to the output directly; that way it can avoid size issues.

chathurace commented 1 year ago

The problem with io:fileReadString(...) is that it cannot read resources from imported libraries. In my case, for a given collection of EDI mappings, I want to pack the EDI mapping JSON files, the generated Ballerina records, and some collection-specific utility code into a Ballerina library. Anyone can then use this library to parse those EDI files without worrying about mappings, code generation, etc. However, because of this restriction, the EDI mapping JSON files packed in the library cannot be accessed.

With the proposed solution, if the compiler doesn't load resources at compile time but instead generates code to load them only when they are referenced, we can avoid the size issues you mentioned.

sanjiva commented 1 year ago

That is not possible: the source is not available at runtime.

What I was suggesting is that the compiler can move the bytes from the source to the output binary file without bringing them into the compiler's memory.

chathurace commented 1 year ago

I was thinking of something like keeping the bytes of the resources in the output binary file, but loading them into the (output program's) memory at runtime only when a resource is referenced.
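
A rough sketch of that load-on-first-reference behaviour, written as ordinary Ballerina. This is only an illustration: a file on disk stands in for bytes kept in the output binary, and RESOURCE_PATH, cachedResource and getResource are hypothetical names, not part of any proposed API.

```ballerina
import ballerina/io;

// Hypothetical path; a stand-in for bytes stored in the output binary.
const string RESOURCE_PATH = "resources/mapping.json";

// Stays empty until the resource is first requested.
byte[]? cachedResource = ();

// Returns the resource bytes, reading them on the first call only.
function getResource() returns byte[]|io:Error {
    byte[]? cached = cachedResource;
    if cached is byte[] {
        return cached;
    }
    byte[] loaded = check io:fileReadBytes(RESOURCE_PATH);
    cachedResource = loaded;
    return loaded;
}
```

A program that never calls getResource() never pays the memory cost of the resource, which is the property I am after.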