scala / scala-library-next

backwards-binary-compatible Scala standard library additions
Apache License 2.0

`tailOption` and `initOption` #165

Open julian-a-avar-c opened 11 months ago

julian-a-avar-c commented 11 months ago

For symmetry. Once in a while I need them because the sequence might be empty.

Seq().tail and Seq().init throw errors. Ideally I want those two to be corrected and return an empty sequence, but I understand that might break backwards compatibility.

One alternative is seq.drop(1) and seq.dropRight(1), which work safely on empty sequences. However, I think adding tailOption and initOption would save me from having to remember that they don't exist whenever I want "take the tail, even if the sequence is empty".
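To make the difference concrete, here is a minimal sketch of how the four methods behave on an empty sequence. It assumes Scala 3 with the 2.13 collections library; the `val` names are only illustrative, and `Try` is used purely so the failure can be inspected rather than crashing the program.

```scala
import scala.util.Try

// Safe spellings: an empty input simply yields an empty result.
val emptyDrop: Seq[Int]      = Seq.empty[Int].drop(1)
val emptyDropRight: Seq[Int] = Seq.empty[Int].dropRight(1)

// Unsafe spellings: both throw on an empty input, so they are wrapped in Try here.
val emptyTail: Try[Seq[Int]] = Try(Seq.empty[Int].tail)
val emptyInit: Try[Seq[Int]] = Try(Seq.empty[Int].init)

// On non-empty input the two spellings agree.
val sameTail: Boolean = Seq(1, 2, 3).tail == Seq(1, 2, 3).drop(1)
val sameInit: Boolean = Seq(1, 2, 3).init == Seq(1, 2, 3).dropRight(1)
```

With the current standard library the two `Try` values should be failures carrying an `UnsupportedOperationException`, while the `drop` variants are plain empty sequences.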

jducoeur commented 11 months ago

I agree that there are a couple of potentially useful methods here, but I don't think the name is quite right, since they shouldn't return Option -- they should return Seq, same as their drop counterparts. Not sure what to call them...

OndrejSpanel commented 11 months ago

What you expect as tailOption / initOption can be done easily with drop(1) / dropRight(1). See also https://stackoverflow.com/questions/42858306/why-no-tailoption-in-scala

ritschwumm commented 11 months ago

@jducoeur why shouldn't they return Option[Seq]? i'd expect them to so i can e.g. use them in a (Option-based) for-comprehension.
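A minimal sketch of that use case, assuming an `Option`-returning `tailOption` existed. The extension below is hypothetical (it is not in the standard library), and `uncons` is just an illustrative helper name.

```scala
// Hypothetical extension: `tailOption` is NOT part of the standard library today.
extension [A](xs: Seq[A])
  def tailOption: Option[Seq[A]] = if xs.isEmpty then None else Some(xs.tail)

// Split a sequence into (head, rest), but only when there is a head.
def uncons[A](xs: Seq[A]): Option[(A, Seq[A])] =
  for
    h <- xs.headOption
    t <- xs.tailOption
  yield (h, t)
```

For a one-element sequence this yields the head paired with an empty rest; for an empty sequence the whole comprehension short-circuits to `None`.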

jducoeur commented 11 months ago

You can, but is that really the useful behavior? While I don't strictly speaking object to an Option version, I'm hard-pressed to think of a time when I would want to use it. What I think I would more often want is a safe version of tail or init, which would return Seq. IMO, that's more naturally cognate to the intent of headOption -- the most convenient way to do this potentially-unsafe operation.

(But obviously, this is subjective.)

morgen-peschke commented 11 months ago

> You can, but is that really the useful behavior?

@jducoeur in this context, not really. If it were returning something like Option[NonEmptySeq], it would provide greater utility.

Ichoran commented 11 months ago

I guess the deeper question is: what is the point of tail at all? drop(1) is the safe way to do it, and the return type is the same. Throwing an exception is almost always bad practice anyway. Yes, List is built out of heads and tails, but why is tail on everything and not just on List?

If you grant that tail was useful to begin with at all, then the fact that the collection wasn't originally empty (even if the tail itself is) is potentially valuable. So if you want tail to exist, it's reasonable for tailOption to exist also, because this, unlike drop(1), signals that there had been a head.

julian-a-avar-c commented 6 months ago

I suppose, what actually bothers me is the documentation?

val vector = Vector()
// Collection's "special" accessors
vector.head       // Tells me it can throw
vector.headOption // Should never throw from signature and docs
vector.init       // Throws but doesn't tell me

vector.last       // Tells me it can throw
vector.lastOption // Should never throw from signature and docs
vector.tail       // Throws but doesn't tell me

// Possibilities I see:
extension [A](vector: Vector[A])
  // From signature, should never throw
  def initOption: Option[Vector[A]] = if vector.isEmpty then None else Some(vector.init)
  def tailOption: Option[Vector[A]] = if vector.isEmpty then None else Some(vector.tail)
  // From name, should never throw
  def initOrEmpty: Vector[A]        = if vector.isEmpty then Vector.empty else vector.init
  def tailOrEmpty: Vector[A]        = if vector.isEmpty then Vector.empty else vector.tail
end extension

vector.initOption
vector.tailOption
vector.initOrEmpty
vector.tailOrEmpty

tailOption feels dumb: why have an Option[Seq[T]] when you can have a Seq[T]? But that's what I would expect tail to do, and tail crashes. I didn't know about this exception until it crashed my program, and having an alternative would help me feel like I'm not about to re-invent the wheel. Calling tail shouldn't unexpectedly throw?

EDIT: I made a dumb typo and flipped the then and else cases because of a "brain fart". I want to acknowledge this here to point out that these are the types of bugs that slip in without anyone noticing.

val tailOrEmpty = if vector.isEmpty then vector.tail else Vector.empty
// -- @julian-a-avar-c, 2024

I think my favorite implementation so far is util.Try(vector.tail) followed by .toOption or .getOrElse(Vector.empty), since it's the one I can most easily read over without worrying about the implementation details. I don't think that's a good thing? vector.tail makes me feel the same way implicit nulls do, as in Option(intOrNull).getOrElse(5).

Ichoran commented 6 months ago

> I think my favorite implementation so far is util.Try(vector.tail) followed by .toOption or .getOrElse(Vector.empty), since it's the one I can most easily read over without worrying about the implementation details. I don't think that's a good thing?

No, this is not a good thing.

If you want the whole list save the first element, safely, use drop(1), not tail. Try(vector.tail).toOption.getOrElse(Vector.empty) is an extremely long and inefficient way to get the same behavior as drop(1).

Likewise with dropRight(1) vs Try(vector.init).etc.etc.
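As a quick check of that claim, the sketch below puts the long `Try` spelling next to the short `drop`/`dropRight` spelling. The function names are made up for illustration; only `Try`, `tail`, `init`, `drop`, and `dropRight` come from the standard library.

```scala
import scala.util.Try

// Long spelling: throw, catch, and substitute an empty Vector.
// (Try has getOrElse itself, so no intermediate toOption is needed.)
def tailViaTry[A](v: Vector[A]): Vector[A] = Try(v.tail).getOrElse(Vector.empty)
def initViaTry[A](v: Vector[A]): Vector[A] = Try(v.init).getOrElse(Vector.empty)

// Short spelling: no exception is ever raised.
def tailViaDrop[A](v: Vector[A]): Vector[A] = v.drop(1)
def initViaDrop[A](v: Vector[A]): Vector[A] = v.dropRight(1)
```

Both pairs return the same result for empty, single-element, and longer vectors; the `Try` versions just pay for constructing and catching an exception in the empty case.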

If drop(1) isn't an easy read for you, the best solution is to practice reading it until it is! The feature is already there, universally supported across all collection types. You just have to use it.

julian-a-avar-c commented 6 months ago

@Ichoran In my experience it's useful to think in terms of head and tail (or init and last) in many situations. As such, I don't want them gone. That said, they break. So I think tailOption and initOption are both good ideas, if maybe with different implementations and names.

If your thoughts are to tough it out and use drop(1) because it's already there (lemme be snarky and say that tail is already there also :smiling_imp:), then please consider that the point of this thread (from the top) was to acknowledge their existence, but to share the shortcomings of tail and family since we are getting ready for the next Scala standard library. With your presented criteria, we should be getting rid of/deprecating headOption, and telling everyone to use vector.take(1).

In case adding vector.tailOption or an alternative is unfeasible for technical or other reasons (and since we already acknowledge tail's value as it stands in the current standard library), I would suggest adding documentation to tail to indicate which exceptions it throws (I have no idea what that would look like. NoTailException? Right now we are getting java.lang.ExceptionInInitializerError). I would also suggest adding a comment highlighting that you can use _.drop(1) instead, as is already done for some methods of the library.
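A rough sketch of what such documentation could look like, written on a hypothetical wrapper (`documentedTail` is not a library method, and the wording is only a suggestion). The exception type named in the `@throws` tag is an assumption based on what the 2.13 collections throw for an empty `tail` as far as I can tell.

```scala
/** The rest of the collection without its first element.
  *
  * @note   If the collection may be empty, prefer `drop(1)`, which returns
  *         an empty collection instead of throwing.
  * @throws java.lang.UnsupportedOperationException if the collection is empty
  */
def documentedTail[A](xs: Seq[A]): Seq[A] = xs.tail
```

The same pattern would apply to `init`, pointing at `dropRight(1)` as the safe alternative.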

tail and init have already proven themselves. They are used and useful as far as I am aware.


Lastly, and for completeness, I agree with you that Try(vector.tail).getOrElse(Vector.empty) is not a good idea. What I was getting at before is that, when thinking in terms of heads, tails, inits, and lasts, the monstrosity above is the easiest to write and read and the least error-prone option when those two things matter most. And that isn't a good thing.

I don't think anyone here is advocating for Try(vector.tail).getOrElse(Vector.empty) (except maybe the author of the SO answer, but even he/she/they mentioned other methods being a better fit). Still, it might be easier to read when you don't know that vector.tail can fail, and in some non-performance-intensive places it might even be worth it, because vector.tail could crash your program, and that is neither intuitive nor documented.


Let me know if any of this sounds appealing, and if I can be of help.

Ichoran commented 6 months ago

I agree that for completeness tailOption and initOption are a good idea, as safe alternatives to the unsafe tail and init, to parallel headOption and lastOption. And certainly the tail and init docs should say that they will throw an exception if the collection is empty!

But what I don't agree with is that someone reaching for Try(xs.tail).toOption.getOrElse(Vector.empty) instead of drop(1) indicates anything but that they're along the path of their process of learning Scala. It does take a while to learn the standard library for collections; it's extremely versatile but this does mean it's not small. It isn't the easiest or clearest way to get the desired behavior except when someone has partly learned the collections API (if they were comfortable with it, they'd drop(1)) and partly learned the Try interface (if they were comfortable with that, they wouldn't add the superfluous toOption; Try has getOrElse itself).

I think we should structure the documentation to help people who are learning. However, I don't think we should structure the library to "help" people who are trying, because of lack of familiarity, to build from the parts they know functionality that already exists. Rather, we should help them learn the functionality that exists.

(Note: The reason to use tailOption is not if you immediately want to getOrElse(Vector.empty) it. It's if you want to use Option to handle the conditional logic about what to do if there is a tail and if there is not.)
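A small sketch of that kind of conditional logic, again assuming a hypothetical `tailOption` extension. The `afterServing` function and its messages are invented purely for illustration.

```scala
// Hypothetical extension: `tailOption` is NOT part of the standard library today.
extension [A](xs: Seq[A])
  def tailOption: Option[Seq[A]] = if xs.isEmpty then None else Some(xs.tail)

// Describe what is left in a queue after serving its first item.
def afterServing(queue: Seq[String]): String =
  queue.tailOption match
    case None                       => "nothing to serve"
    case Some(rest) if rest.isEmpty => "served the last item"
    case Some(rest)                 => s"served one, ${rest.size} remaining"
```

With `drop(1)` alone, the first two cases would both look like an empty sequence, so the caller would need a separate emptiness check to tell them apart.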

julian-a-avar-c commented 6 months ago

> But what I don't agree with is that someone reaching for Try(xs.tail).toOption.getOrElse(Vector.empty) instead of drop(1) indicates anything but that they're along the path of their process of learning Scala

Ok... Full stop... Moving on...


I'm aware that tailOption signals whether or not there was a head, whereas tailOrEmpty, which returns a Seq[T], does not. Thank you for letting me know regardless, and please read the code I wrote?

As a side note, I believe tailOption is a worse design, since for knowing if there is a head, a pattern match might be more idiomatic. But tailOrEmpty says "I want to use the tail and I don't know if it exists so give me an empty instead" which is what tail does if there is only one element.


I suppose what I'm asking is:

What are the reasons to avoid this design? And what are the reasons to add it?

As I see it, there are only good things from tailOption or tailOrEmpty. And I also want to help Scala in any way I can. An intuitive standard library makes an intuitive language.

julian-a-avar-c commented 6 months ago

In fact, I would say I'm advocating for tail failing if there is a single element in a collection, but I understand that would be shocking for existing users, AND would break compatibility.

julian-a-avar-c commented 6 months ago

If you're interested, one more suggestion: if we wanted tailOption to more closely model the meaning "get me the tail or nothing", it would look more like:

extension [A](vector: Vector[A])
  def tailOnlyOption: Option[Vector[A]] = if vector.length < 2 then None else Some(vector.tail)

That is to say, if there are 0 or 1 elements (no tail), return None, else the tail, as opposed to Some(Vector.empty) if there is a head. But I advocate for tailOrEmpty.
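To see where the two readings diverge, here is a side-by-side sketch. Both extensions are hypothetical and copied from the definitions proposed in this thread; neither exists in the standard library.

```scala
// Both methods are hypothetical proposals from this thread.
extension [A](v: Vector[A])
  // None only when there is no head.
  def tailOption: Option[Vector[A]] =
    if v.isEmpty then None else Some(v.tail)
  // None whenever the tail would be empty or missing.
  def tailOnlyOption: Option[Vector[A]] =
    if v.length < 2 then None else Some(v.tail)

val none = Vector.empty[Int]
val one  = Vector(1)
val two  = Vector(1, 2)
```

The only input where they differ is the single-element vector: `one.tailOption` is `Some(Vector())`, while `one.tailOnlyOption` is `None`.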

Ichoran commented 6 months ago

> In fact, I would say I'm advocating for tail failing if there is a single element in a collection, but I understand that would be shocking for existing users, AND would break compatibility.

Part of learning a new language is getting used to the standard ways to do things. Maybe it makes more sense to you that tail only works if there is a nonempty tail, but everyone used to Scala understands it as any kind of tail, including an empty one (but it must be a tail: there must be a head). People have developed the intuition that empty collections are fine, and that head/tail are low-level risky operations while drop(1) or match are higher-level safe operations (if you cover all the branches of the match, which the compiler will warn you about if you don't).
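For reference, a minimal sketch of the match-based style mentioned above, using `List` (the `describe` function is just an illustrative name). Because `List` is sealed, the compiler can warn if one of the cases is left out.

```scala
// Exhaustive match over a List: no call to head or tail can throw here.
def describe(xs: List[Int]): String = xs match
  case Nil          => "empty"
  case head :: Nil  => s"just $head"
  case head :: tail => s"$head followed by ${tail.size} more"
```

Note that the second case is exactly the "head with an empty tail" situation: the tail exists, it is simply `Nil`.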

I agree with you that one could define things differently. It's not insensible a priori. It just isn't how Scala works, and I don't agree that it's more intuitive in general.

I mean, you could also argue that you should index from 1 in Scala instead of 0. Who starts counting at 0, anyway?! R and Matlab do, in fact, index from 1. But most languages, including Scala (and Java, and Python, and C/C++, and JavaScript) index from 0. If you wanted a new low-level operation keep(start, end) that would throw an exception if either bound was wrong, but start and end were 1-indexed and inclusive end because it's "more intuitive", it would, in Scala, confuse the heck out of everyone because the intuition is for 0-indexing and exclusive end.

So, some things can be made more intuitive within the context of the language, but mostly by making the patterns more regular. If you have take and takeWhile and takeRight, but not takeRightWhile, that's kind of a glitch in the pattern. Likewise, to have head (unsafe) and headOption for safety and monadic handling, but not tail with tailOption for safety and monadic handling, is a glitch in the pattern and it's more intuitive if it's fixed. However, it's less intuitive if for tailOption, but nothing else, the tail has to be nonempty. That doesn't make things more regular, it makes them less regular, breaking intuition for anyone who has taken enough time to develop it. And to have nonEmptyTailOption is possible, but the collections library is already huge and once you start having non empty collections you start wanting type support for it (otherwise the compiler can't help you keep track of when something is there and when it might not be), and then you end up with something like NonEmptyList which Cats already has--so just use Cats if you want to work that way!

If you make your own language or even your own library, you can make these decisions anew. But existing well-established regularities should be left alone, and the bar to add new variants should be quite high--mostly finding very common use cases that are only very awkwardly supported by the existing library.

(There are still a surprising number of these, despite the already-large library.)

> But tailOrEmpty says "I want to use the tail and I don't know if it exists so give me an empty instead" which is what tail does if there is only one element.

But that has the behavior of drop(1). We don't need another way to spell it. If we were going to spell it differently, we should probably spell it dropOne. But why? drop(1) is pretty clear, once you know how drop works.
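A short sketch checking that equivalence over a few sample inputs. `tailOrEmpty` and `initOrEmpty` are the hypothetical extensions proposed earlier in the thread, and the sample list is arbitrary.

```scala
// Hypothetical extensions as proposed in this thread.
extension [A](v: Vector[A])
  def tailOrEmpty: Vector[A] = if v.isEmpty then Vector.empty else v.tail
  def initOrEmpty: Vector[A] = if v.isEmpty then Vector.empty else v.init

val samples: Vector[Vector[Int]] =
  Vector(Vector.empty, Vector(1), Vector(1, 2), Vector(1, 2, 3))

// Compare each proposal with the existing standard-library spelling.
val tailAgrees: Boolean = samples.forall(v => v.tailOrEmpty == v.drop(1))
val initAgrees: Boolean = samples.forall(v => v.initOrEmpty == v.dropRight(1))
```

Both flags come out `true` for these samples, which is the point being made: the proposed names would be new spellings for behavior that already exists.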

julian-a-avar-c commented 6 months ago

I see, thank you for your responses. I suppose my vision was shortsighted.