leanprover / lean4

Lean 4 programming language and theorem prover
https://lean-lang.org
Apache License 2.0
4.63k stars 414 forks source link

Lean IO.FS namespace it is too hard to open a file and read it. #864

Closed lovettchris closed 2 years ago

lovettchris commented 2 years ago

Prerequisites

Description

I found it quite difficult to figure out how to just open a file as a Stream. I ended up having to search the lean codebase and found the Handle.mk constructor. Perhaps we could use some helper methods like these which show up in many other languages/frameworks:

def openFileForRead(fname : FilePath) : IO FS.Stream := do
  let hTee ← FS.Handle.mk fname FS.Mode.read true
  return FS.Stream.ofHandle hTee

def openFileForWrite(fname : FilePath) : IO FS.Stream := do
  let hTee ← FS.Handle.mk fname FS.Mode.write true
  return FS.Stream.ofHandle hTee

We do already have this helper method which is great, but sometimes you need a Stream because you need to pass it to another function.

def readFile (fname : FilePath) : IO String := do
  let h ← Handle.mk fname Mode.read false
  h.readToEnd

Steps to Reproduce

N/A

Expected behavior: [What you expect to happen]

Actual behavior: [What actually happens]

Reproduces how often: [What percentage of the time does it reproduce?]

Versions

You can get this information from copy and pasting the output of lean --version, please include the OS and what version of the OS you're running.

Additional Information

Any additional information, configuration or data that might be necessary to reproduce the issue.

Kha commented 2 years ago

but sometimes you need a Stream because you need to pass it to another function.

What kind of function? I think Handle should be the standard type, except that stdin etc. are of course Streams. I wonder if the stream isolation is worth this complication in the API after all. It's not like other languages have an analogue of Stream.

tydeu commented 2 years ago

after all. It's not like other languages have an analogue of Stream.

Haskell has io-streams which has a somewhat similar approach. One major difference is the split between an InputStream and an OutputStream which Lean doesn't have.

lovettchris commented 2 years ago

Right and .NET/C# world, everything is pretty much a Stream (Files, Sockets, Databases) and then they build "readers" and "writers" on top of streams and passing those around everywhere, for example to XmlReader and XmlWriters, or a Jsonreader, JsonWriter, etc... you can do a full "pull model" database "query" that returns a stream of data, which is converted in a streaming fashion to a stream of objects which is then converted to a stream of Json, super efficiently. Streams are important when you can't just pull the entire thing into memory because it is 20 terabytes of data, etc. Oh and in this world everything is NOT a fully random access file handle, they are usually only forward only streams.

tydeu commented 2 years ago

@lovettchris Very true -- streams are a important concept in imperative programming. Sadly, they are a lost less popular in functional programming languages because they are inherently impure. Thus, you tend to only see them in places where they are absolutely necessary (i.e,, high performant code dealing with very big data like in your examples). As such, it is not surprising that the rather bare bones standard library of Lean lacks much support for them. However, it is certainly something that should go on the TODO list.

lovettchris commented 2 years ago

@tydeu that seems like a fun challenge, how to bring stream processing into the functional world. I mean the individual step of "here's the next value please handle it" is surely something a functional program would be really good at, perhaps we need to create a way to totally hide the fact that the stream reader is having a side effect on the state of the stream, because that is such an unimportant side effect in the world of stream processing. The only thing you need to deal with is an Exception must be raised if the stream is interrupted (socket closed randomly etc). And perhaps passing around Except is annoying... can that also be hidden in some kind of implicit context... perhaps some sort of "interruptable/cancellable context" or something (because app driven cancellation is also important so resources can be freed up asap) etc.

tydeu commented 2 years ago

@lovettchris the more novel way to such stateful primitives like streams more painless / easier to manage in functional programming languages is through linear logic and linear / uniqueness types (e.g., Clean, Idris, etc.).

However, that requires altering the foundational logic of the language which is much more a concern for a theorem prover than it is in more general purpose programming languages. It is especially problematic for something like Lean, which also wants to appeal to classical mathematicians (who would probably like Lean to be more classical rather than less). While classical logic is still embeddable in such a system, it still introduces more conceptual hurdles along the way.

As such, in Lean, the answer for streams is likely still just to use monads in the usual manner, with the major advantage that doing so is relatively painless in Lean as a result of its very nice do notation. It is simply a matter of getting around to writing and maintaining all the supporting APIs.

lovettchris commented 2 years ago

Ok, sounds deep, which brings me back to, I still find these helper functions easier to discover and use: openFileForRead and openFileForWrite... how about it?

tydeu commented 2 years ago

@lovettchris why not just a single openFile?

def openFile (fileName : FilePath) (mode : FS.Mode) (bin : Bool := true) : IO FS.Stream := do
  FS.Stream.ofHandle (← FS.Handle.mk fileName mode bin)

that would be more consistent with open APIs in other languages (e.g., Python) which generally include the mode as a required parameter.

EDIT: it might be best to name the function FS.Stream.openFile to prevent confusion with Handle.mk.

lovettchris commented 2 years ago

That's still better than nothing, but not quite as easy as openFileForRead and openFileForWrite where the mode is implied in the function name.

tydeu commented 2 years ago

@lovettchris

That's still better than nothing, but not quite as easy as openFileForRead and openFileForWrite where the mode is implied in the function name.

Do you know of language(s) that do this? In every language I can recall, the mode is always passed as an argument to the relevant 'open file' command and not part of the function's name (e.g., C, C#, Python, Ruby, etc.). As such, I think doing otherwise would be rather counterintuitive to users coming from other languages (or users who swap back and forth often).

Kha commented 2 years ago

Haskell has io-streams which has a somewhat similar approach.

Similar-ish, but I meant specifically our Stream introduced specifically for the std streams. We know the stream isolation is leaky anyway (e.g. FFI functions can write directly to stdout), so the benefit is questionable. We could instead revert the std streams to Handle for a more regular API and capture only the output of IO.(e)print in the server. If you explicitly take a std stream and write to it, we'll assume you know what you're doing.

gebner commented 2 years ago

Do you know of language(s) that do this? In every language I can recall, the mode is always passed as an argument to the relevant 'open file' command and not part of the function's name (e.g., C, C#, Python, Ruby, etc.).

Rust does: https://doc.rust-lang.org/std/fs/struct.File.html#method.open

tydeu commented 2 years ago

@gebner I was inquiring about languages that have separate openRead and openWrite functions (as that was what Chris was talking about). Your Rust example is just a single function open that is does openRead only (there is no corresponding openWrite). That would be more equivalent to:

def openFile (fileName : FilePath) (mode := FS.Mode.read) (bin : Bool := true) : IO FS.Stream := do
  FS.Stream.ofHandle (← FS.Handle.mk fileName mode bin)

but just more restrictive. It is also largely an artifact of the OpenOptions concept in Rust, which does not have a corresponding equivalent in Lean.

EDIT: I guess the create function is sort of an openWrite, though it has more meaning in Rust than just that. Still, an openFile/createFile split would be a fine idea as well. Though, I think the split works better in Rust than it does here because it has finer grain open options (with a create setting) that allow for more intuitive meaning to the split.

leodemoura commented 2 years ago

I am closing the issue. We are planning to have a standard library for Lean in the future (see issue #528). We can continue the discussion there.