BlockstreamResearch / simfony

Rust-like high-level language that compiles down to Simplicity bytecode. Work in progress.
19 stars 6 forks source link

Remove old witness parser #59

Closed uncomputable closed 1 month ago

uncomputable commented 2 months ago

Remove the old witness file parser which used the Simplicity bit encoding and which was prone to fail. Temporarily disable witness expressions (during satisfaction) because we don't have the new witness file parser yet. Present error messages to prevent confusion. Refactor the library functions compile and satisfy so they are usable by the web IDE, which handles strings instead of files on disk.

apoelstra commented 2 months ago

In 60c18e237472dd1ce925aeb920679a3d783479df:

Can you make this just take a generic R: std::io::Read rather than an Arc<str>? Then you can pass in a byteslice, a file, or whatever you want.

Otherwise ACK!

uncomputable commented 2 months ago

I like the idea of using generics, but Arc<str> does not implement std::io::Read. We would have to convert the input string into a byte slice. The compiler would then convert the bytes back into characters, because the compiler logic works in terms of Arc<str>. This feels like an unnecessary indirection.

The point of this PR was to split the file IO from the compiler. The compiler expects Arc<str> so we pass Arc<str> in the compile / satisfy methods. Callers can generate this Arc<str> using file IO or using other means, like the web IDE.

uncomputable commented 2 months ago

RichError expects Arc<str> for pretty errors (we can change that). The PEST parser expects &str for its parsing (we cannot change that). It would be nice to have the compile / satisfy methods take at least something with str. std::io::Read seems overly low-level.

apoelstra commented 2 months ago

The compiler would then convert the bytes back into characters, because the compiler logic works in terms of Arc. This feels like an unnecessary indirection

The conversion between characters and bytes is basically free (chars to bytes involves checking the 0x80 bit to see if its ASCII, and if so, doing nothing; bytes to chars is even less work).

The allocation of an Arc<str> is potentially expensive. Fine to leave it for now since the existing compiler needs them.

std::io::Read seems overly low-level.

For better or worse, it is our standard "streaming bytes" API. It isn't low-level. It supports streaming, buffering, limiting how many bytes are read. It lets callers choose between TCP streams, files, vectors, byte slices, cursors, etc., and easily track data such as how many bytes have been consumed.

apoelstra commented 2 months ago

Can you run cargo fmt on these commits?

uncomputable commented 1 month ago

Rebased, formatted and changed Arc<str> to &str.

Interesting that std::io::Read is Rust's streaming API. It seems very useful for applications that work on a few bytes at a time. I hope we can add support to the Simfony compiler at some time, but at the moment the PEST parser greedily expects a &str slice. If we passed a std::io::Read, then we would have to consume the iterator and store all bytes in a buffer, because &std needs to be continuous memory. At this point, passing &std is more direct.

An upside of &std is that there is no more allocation. I also realized that with_file supports a lazy allocation of Arc<str> in case of errors, which is nice :)