haskell / filepath

Haskell FilePath core library
BSD 3-Clause "New" or "Revised" License

A type for filepath components #234

Open mmhat opened 3 months ago

mmhat commented 3 months ago

While working on https://github.com/commercialhaskell/path/pull/192 I noticed that a separate type for a path component might be helpful for some use cases. Since we have the nice AFPP-style APIs, such a type is nothing more than a slice of the underlying filepath:

module System.OsPath.Component where

import System.OsPath (OsPath)

-- | A component of a filepath, i.e. a filepath that does not contain any path separators.
newtype Component = Component SliceOf

-- This should probably live in the os-string package
data SliceOf = SliceOf
  { sliceOfOffset :: {-# UNPACK #-} Int -- ^ The offset where the slice starts in bytes
  , sliceOfLength :: {-# UNPACK #-} Int -- ^ The length of the slice in bytes
  , sliceOfFilePath :: {-# UNPACK #-} OsPath -- ^ The underlying filepath
  }
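To make the idea concrete, here is a minimal sketch of how components could be located as offset/length slices without copying the path. It is only an illustration under stated assumptions: `Slice`, `components` and `materialize` are hypothetical names, a plain `ShortByteString` stands in for `OsPath`, and only the POSIX separator `/` (byte 47) is handled.

```haskell
import qualified Data.ByteString.Short as SBS
import Data.ByteString.Short (ShortByteString)
import Data.Word (Word8)

-- Hypothetical stand-in for the proposed SliceOf, specialised to
-- ShortByteString so the sketch only depends on the bytestring package.
data Slice = Slice
  { sliceOffset :: !Int             -- ^ Offset in bytes where the slice starts
  , sliceLength :: !Int             -- ^ Length of the slice in bytes
  , sliceBuffer :: !ShortByteString -- ^ The underlying (shared) buffer
  }

-- | Find all non-empty components of a POSIX-style path. Every result
-- shares the input buffer; only offsets and lengths are computed.
components :: ShortByteString -> [Slice]
components buf = go 0 0
  where
    n   = SBS.length buf
    sep = 47 :: Word8 -- '/', an assumption: POSIX only

    -- 'start' is where the current component begins, 'i' is the cursor.
    go start i
      | i >= n                 = emit start i []
      | SBS.index buf i == sep = emit start i (go (i + 1) (i + 1))
      | otherwise              = go start (i + 1)

    -- Emit a slice for [start, end) unless it is empty.
    emit start end rest
      | end > start = Slice start (end - start) buf : rest
      | otherwise   = rest

-- | Copy a slice out into a fresh ShortByteString.
materialize :: Slice -> ShortByteString
materialize (Slice off len buf) =
  SBS.pack [SBS.index buf i | i <- [off .. off + len - 1]]
```

With this sketch, `map materialize (components p)` yields the individual components of `p` as standalone values, while `components p` alone allocates no new byte buffers.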

For my part, I am only interested in adding this for the AFPP-style filepaths.

Would the contribution of such a type and the API for working with it be accepted? Is a CLC proposal needed?

hasufell commented 3 months ago

I have thought about it, but it would require re-implementing most of the OsString API, similar to Data.Bytes. That is considerable work (also mind that we have a custom Word16 implementation for Windows).

And it is not yet clear to me what the advantage is. We're usually not dealing with large data when we get an OsString back from an OS API... reading files does not yield OsString. It is rather things like getting env variables, program arguments or filepaths.

Where this becomes interesting is parsers (which rely on slicing to be efficient). But my idea there would be to:

  1. extract the ShortByteString
  2. convert it to Data.Bytes.Bytes
  3. run the parser
  4. convert back to ShortByteString

For that we would need parser implementations for Bytes, which do not exist yet, afaics.
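The four steps above can be sketched roughly as follows. This is not the byteslice-based pipeline itself: since a parser on Bytes is not available, the sketch swaps in strict `Data.ByteString.ByteString` (which also supports O(1) slicing) as the sliceable type, and a simple separator split stands in for "the parser". `parseComponents` is a hypothetical name, and step 1 (unwrapping the `OsString`) is assumed to have happened already.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Short as SBS
import Data.ByteString.Short (ShortByteString)

-- | Split a POSIX-style path into its non-empty components.
-- Step 1 is assumed done: the argument is the ShortByteString that
-- was already extracted from the OsString.
parseComponents :: ShortByteString -> [ShortByteString]
parseComponents sbs =
  let -- Step 2: convert to a sliceable representation (this copies once).
      bytes = SBS.fromShort sbs
      -- Step 3: "run the parser". BS.split returns slices that share
      -- the buffer of 'bytes'; 47 is '/' (an assumption: POSIX only).
      pieces = filter (not . BS.null) (BS.split 47 bytes)
      -- Step 4: convert each result back (one copy per component).
  in map SBS.toShort pieces
```

Note that this variant pays for a copy in steps 2 and 4, which is acceptable for short inputs such as filepaths but is exactly the overhead a native sliced type would avoid.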

@Bodigrim

Bodigrim commented 3 months ago

I don't think there is much benefit from sliced strings for the os-string / filepath use case.

A parser on Bytes from byteslice would be a helpful development, yes.