lspitzner / pqueue

Haskell priority queue package
Other
15 stars 13 forks source link

Read instances are somewhat unsafe #117

Open treeowl opened 1 year ago

treeowl commented 1 year ago

We currently have

instance Read a => Read (MinQueue a) where
  readPrec = parens $ prec 10 $ do
    Ident "fromAscList" <- lexP
    xs <- readPrec
    return (fromAscList xs)

If someone uses this to read from an untrusted source, they can end up with a queue that doesn't obey the order invariant. How bad is that likely to be? I have no idea, but it makes me a bit nervous. It's also conceptually wrong: a parser for ascending lists should produce a parse error if the list is not ascending.

It seems the "right" thing is to check whether the list is indeed ascending. It would also be good to add a parser alternative for fromList, which wouldn't require an ascending list.

Unrelatedly, it seems a bit odd to parse the whole list before starting to build the queue. To be perfectly correct, I think this is unavoidable, but if we're willing to be just slightly sloppy, we should be able to weave the queue construction into the list parsing, and fall back to the separate parsing if that fails. This is technically wrong, because readListPrec for a type doesn't have to accept standard list syntax or (if it does) to interpret it in a standard way, but I don't know of any types that don't accept it, and I'd be a bit horrified if any accepted it with a non-standard interpretation.

konsumlamm commented 1 year ago

It seems the "right" thing is to check whether the list is indeed ascending. It would also be good to add a parser alternative for fromList, which wouldn't require an ascending list.

Agreed. We can argue what it should do with "fromAscList" if the list is not ascending, should it throw an error or just use fromList?

Unrelatedly, it seems a bit odd to parse the whole list before starting to build the queue. To be perfectly correct, I think this is unavoidable, but if we're willing to be just slightly sloppy, we should be able to weave the queue construction into the list parsing, and fall back to the separate parsing if that fails. This is technically wrong, because readListPrec for a type doesn't have to accept standard list syntax or (if it does) to interpret it in a standard way, but I don't know of any types that don't accept it, and I'd be a bit horrified if any accepted it with a non-standard interpretation.

I don't really care about that.

treeowl commented 1 year ago

It seems the "right" thing is to check whether the list is indeed ascending. It would also be good to add a parser alternative for fromList, which wouldn't require an ascending list.

Agreed. We can argue what it should do with "fromAscList" if the list is not ascending, should it throw an error or just use fromList?

Throwing an error is not an option, IMO; this is an input error and not a program error. That leaves two options:

  1. Make parsing fail. This feels right from a semantic standpoint. However, it's wretched from a user experience standpoint because ReadPrec parsers don't actually support error reporting at all (the argument to fail is thrown away). All the user will know is that their input didn't parse, and they'll be on their own to figure out why.
  2. Accept the input anyway and put it in order. While I'm slightly squicked by the semantic mismatch, I think this is probably the better choice given the limitations of the parsing type.

How should 2 work, operationally? I've done some experiments with a slightly different queue type, and found that interleaving list parsing with queue building is a substantial optimization. So I think when we see "fromAscList" we should probably start building the queue the way we do in fromAscList, but watching to see if it's actually ascending. If we hit a descent, we should switch mechanisms and build the queue the rest of the way using the general mechanism.

treeowl commented 1 year ago

All that said, just rejecting non-ascending inputs given "fromAscList" is definitely easier, so I'd be mostly okay with that.

konsumlamm commented 1 year ago

I'd be fine with either option. I don't think we need to care about interleaving list parsing and queue building, as I think/hope noone uses the Read instance for anything serious (I can't imagine any use case for it).