softwaremill / sttp

The Scala HTTP client you always wanted!
https://sttp.softwaremill.com
Apache License 2.0
1.45k stars 309 forks source link

Allow configuration of URI path encoding. #306

Closed jonathanhood closed 5 years ago

jonathanhood commented 5 years ago

I'm attempting to deal with URIs that have some special requirements (taken from RFC8040 Section 3.5.3) for how encoding is performed - particularly with regards to how , is handled. As it stands, trying to handle the special case in that RFC "on top" of the existing Uri structure in sttp results in double-encoding since sttp, itself, wants to perform the encoding and is unaware of these special considerations.

Now, asking sttp to become aware of these particular constraints of my admittedly uncommon use case is a bit much. Instead, I'd like some way to inject some configuration (similiar, perhaps, to how QueryFragmentEncoding is handled) that allows me to at least disable or modify sttp's built in encoding.

An example of the configuration structure:

sealed trait PathEncoding
object PathEncoding {
  // (default) Automatically encode all path segments
  case object All extends PathEncoding

  // Disable segment encoding and assume provided segments
  // are already properly encoded.
  case object Disabled extends PathEncoding
}

This would require changes in encoding (during toString) similiar to how query parameter encoding is already performed based this configuration provided during Uri construction.

jonathanhood commented 5 years ago

It looks like this request is, perhaps, similar (at least in proposed solution) to #287.

aappddeevv commented 5 years ago

I think that escaping should be an instance provided to a constructor in some way so escaping can be completely customized. Also adding something like @adamw mentioned using a Literal class would also be a good thing as well for simple cases.

@adamw, what do you think? I'd like to move off a forked version of sttp. Literal would take care of the immediate need and seems useful in the long run anyway.

jonathanhood commented 5 years ago

I suppose the paths could change to be a sequence of some kind of abstract type, e.g.:

sealed abstract trait PathSegment
case class Literal(seg: String) extends PathSegment {
  override def toString: String = seg
}
case class Escaped(seg: String) extends PathSegment {
  def encode(x: String): String = ???
  override def toString: String = encode(seg)
}

case class Uri(path: List[PathSegment], ...) {
   // somewhere in here...
  val encodedPath: String = path.map(_.toString).mkString("/")
}

However, that would at least break compatibility with existing API users. Thoughts on that compatibility issue?

adamw commented 5 years ago

If we are to break compatibility, now is a good time (as there's a lot of breaking changes in the v2 branch already). The only thing is that the changes would have to be done on the v2 branch.

+1 on PathSegment. It's probably a better name than PathFragment, as it might cause confusion with Fragment, which is also part of the uri :). Maybe we should also then have QuerySegment instead of QueryFragment?

Finally, would the encoding strategies be re-useable? So both path & query fragments would consist of a string + an encoding strategy. A potential problem is that the encoding strategies could behave differently depending on context, but maybe this could be made to work consistently.

jonathanhood commented 5 years ago

I suppose, then, I see two paths:

  1. My original suggestion which could probably go in the 1.x release line immediately.
  2. What's suggested here, to be done for inclusion in 2.x.

On my end, I have a rather short term need for this feature and would rather not take Mx releases into production code. As a result, I lean towards the original proposal. That said, I also see value in the greater flexibility of what's being suggested here with PathSegment. So, I can take on a few piles of work:

  1. Just do the work in v2 to introduce the PathSegment content. I can, on my end, fork a release of 1.x which fixes my short term need. I can, later, upgrade to the real 2.x release series when they are available.
  2. Do both and release a 1.x that contains my original suggestion, and then immediately rips it out in 2.x in favor of the PathSegment approach.
  3. Abandon the PathSegment approach completely and go with the original suggestion (which should be able to go in both 1.x and 2.x).

I really don't like the idea of internally forking builds on my end. That said, I can do that. Just looking for some guidance on release timelines and what the "right" path in terms of this library itself.

I intend to raise a PR soon for the original proposal - mostly because I already have the code sitting on my machine. I'm perfectly happy if that gets closed for some other path/solution/whatever, but may as well put it out there for comment.

aappddeevv commented 5 years ago

It might be less of a breaking change to just add a method to Uri that is def unencoded: String that returns the string but not encoded. You can change things underneath the hood to whatever you want in terms of state management e.g. a hidden boolean flag. But at least the only API change is that one method. We would need to change our code though. And in the future unencoded could absorb any flags/types/config things that are decided on.

jonathanhood commented 5 years ago

@aappddeevv I'm not sure how adding that method would help things, but I'm probably missing some crucial insight. After all, the underlying backends would still use toString in order to serialize things which would then apply encoding? Are you suggesting that those backends, based on some kind of configuration, instead call unencoded in some situations? Perhaps you can elaborate more?

aappddeevv commented 5 years ago

It's equivalent to a flag on the object. I probably goofed in returning a string in my comment above. In my hack, I justed added boolean flag to the Uri object to turn off encoding for toString which is like adding your ADT above. Sorry for the confusion. I'm good with any answer that makes it easy though.

case class Uri (...., encode: Boolean = true) {
...
  def dontEncode: Uri = this.copy(encode = false)
}
jonathanhood commented 5 years ago

So, my original proposal was effectively a flag as you describe. The only difference is using an abstract type to declare intention rather than a Boolean. This is a bit cleaner overall because it is expandable if ever needed and more clear in behavior to the caller.

aappddeevv commented 5 years ago

Yes, they are both flags.

jonathanhood commented 5 years ago

@adamw Thanks for the quick merge! Looks like you are on a bit of a PR merge rampage :)

I suppose the question still remains on what we'd like the 2.x API to look like. I'm happy to supply code to that end as well.

adamw commented 5 years ago

Thanks! I think #309 is a good short term fix, but for v2 I'd prefer to have segments, I thinks that's a more flexible solution (more local). I saw you submitted a PR for that, I'll take a look tomorrow/next week :)

adamw commented 5 years ago

I've implemented an approach similar to #310, but more backwards compatible and more flexible, using the idea of having the encoding as an arbitrary function. See: https://github.com/softwaremill/sttp/blob/v2/model/shared/src/main/scala/sttp/model/Uri.scala

@jonathanhood would this work for you as well?

jonathanhood commented 5 years ago

Ya, that seems fine to me. Lots of that actually looks the same as the PR I raised, so should be fine.

jonathanhood commented 5 years ago

Closing this issue, as it now seems to be resolved in 1.x and 2.x sttp releases.