mthom / scryer-prolog

A modern Prolog implementation written mostly in Rust.
BSD 3-Clause "New" or "Revised" License

mmap for library(pio) with partial strings #251

Open UWN opened 4 years ago

UWN commented 4 years ago

(This is for a later moment, after #24 and #95 are done.)

For UTF-8 files not containing a zero-byte (the majority of files to be parsed), phrase_from_file(Phrase__0, File) could avoid incremental copying altogether by using mmap(2). The file is mapped in one step into a fitting memory area of the heap, up to the last page, which is written anew with a terminating zero-byte and a nil at the end.
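The split described above can be sketched as a small calculation. This is only an illustration under one reading of the proposal: whole pages are mapped as they are, and the bytes in the last, partial page are the ones written anew together with the terminator. `MapPlan` and `plan_mapping` are hypothetical names, not part of Scryer Prolog, and no actual mapping is performed here.

```rust
/// How a file of `file_len` bytes would be split under the scheme above.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
struct MapPlan {
    /// Bytes in whole pages that could be mapped as they are.
    direct_len: usize,
    /// Bytes of the file that fall into its last, partial page.
    tail_len: usize,
    /// Offset at which the terminating zero-byte would be written.
    zero_offset: usize,
}

/// Splits a file length into the directly mapped part and the rewritten tail.
fn plan_mapping(file_len: usize, page_size: usize) -> MapPlan {
    assert!(page_size > 0, "page size must be non-zero");
    let tail_len = file_len % page_size;
    MapPlan {
        direct_len: file_len - tail_len,
        tail_len,
        zero_offset: file_len,
    }
}
```

For a 10000-byte file and 4096-byte pages, this gives 8192 directly mapped bytes and a 1808-byte tail, with the zero-byte at offset 10000.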

UWN commented 4 years ago

... which means that in the worst case two pages are needed: the file ends one byte before the next page, so the terminating zero-byte fills that page and an additional page is needed just for the [].
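The worst case can be checked with a little arithmetic. The sketch below is an assumption-laden illustration: `pages_needed` is a hypothetical helper, and the size of the nil cell (`nil_size`) is a parameter because the thread does not state it.

```rust
/// Number of pages needed for the file bytes, one terminating zero-byte,
/// and a nil cell of `nil_size` bytes (the cell size is an assumption).
fn pages_needed(file_len: usize, page_size: usize, nil_size: usize) -> usize {
    assert!(page_size > 0, "page size must be non-zero");
    let total = file_len + 1 + nil_size;
    // Round up to whole pages.
    (total + page_size - 1) / page_size
}
```

With 4096-byte pages and an assumed 8-byte nil cell, a 4095-byte file needs 2 pages, because the zero-byte lands on the last byte of the first page and the nil spills into a second one. A 100-byte file fits into 1 page.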

UWN commented 4 years ago

Just to be sure: phrase_from_file/3 would need to first open the file, mmap it, and scan it for a zero-byte and malformed UTF-8 encodings, reporting them immediately. Further, the number of characters can be determined this way, should this be of use somehow.
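A minimal sketch of such a scan over the mapped bytes, assuming the mapping is exposed as a byte slice. `ScanError` and `scan_mapped` are hypothetical names; only `std::str::from_utf8` and `Utf8Error::valid_up_to` are real standard-library calls. Whichever problem occurs first in the file is the one reported.

```rust
/// What the scan can reject, with the byte offset of the problem.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum ScanError {
    ZeroByte { offset: usize },
    MalformedUtf8 { offset: usize },
}

/// Returns the number of characters, or the first problem found.
fn scan_mapped(bytes: &[u8]) -> Result<usize, ScanError> {
    // Locate the first zero-byte, if any.
    let zero = bytes.iter().position(|&b| b == 0);
    let end = zero.unwrap_or(bytes.len());
    // Validate only the part before the zero-byte, so that an earlier
    // encoding error is reported before a later zero-byte.
    match std::str::from_utf8(&bytes[..end]) {
        Err(e) => Err(ScanError::MalformedUtf8 { offset: e.valid_up_to() }),
        Ok(text) => match zero {
            Some(offset) => Err(ScanError::ZeroByte { offset }),
            None => Ok(text.chars().count()),
        },
    }
}
```

The character count falls out of the same validated data, which matches the remark that it can be determined along the way.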

Note that even in the presence of a zero-byte, mmap can still be used, at least for the sequence up to that zero-byte.
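As a rough illustration of that fallback, the mapped bytes can be split at the first zero-byte. `split_at_zero` is a hypothetical helper; how the remainder would then be handled is not specified in this thread.

```rust
/// Splits the bytes at the first zero-byte.
/// Returns the prefix before it and, if a zero-byte exists,
/// the remainder after it (which may be empty).
fn split_at_zero(bytes: &[u8]) -> (&[u8], Option<&[u8]>) {
    match bytes.iter().position(|&b| b == 0) {
        Some(i) => (&bytes[..i], Some(&bytes[i + 1..])),
        None => (bytes, None),
    }
}
```

The prefix is the part that could stay in the mapping unchanged, since it contains no zero-byte.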

UWN commented 3 years ago

Tiny moral update: mmap is faster than syscalls because it uses AVX instructions.