theam / aws-lambda-haskell-runtime

⚡Haskell runtime for AWS Lambda
https://theam.github.io/aws-lambda-haskell-runtime/
Other
270 stars 48 forks source link

Why String? #38

Open axman6 opened 5 years ago

axman6 commented 5 years ago

There's a lot of usage of "strict" Strings in the package. Is there a good reason not to use Text? Particularly since there's a dependency on Aeson (and probably others), Text is already an indirect dependency. using !String just forces the list constructor, and leaves the tail of the list unevaluated (and also causes an under the hood conversion from Text to String within Aeson).

NickSeagull commented 5 years ago

We were using Text before String (we dont use String in our projects generally), but we found that using Text was a tiny bit slower somehow (I guess because we were doing lots of conversions). There wasn't any thorough benchmarking. But we will do it in the near future, so we can have much better data on that. And if needed, we will switch over to Text again :)

jacobstanley commented 5 years ago

I just came to ask exactly this, measurement ftw.

Have you tried ByteString? My gut says Text UTF-16 can be problematic for sure when you're not doing a lot of string manipulation. Especially since aeson reads utf8 bytes via attoparsec (if I recall correctly) so it needs to be converted back to that anyway.

NickSeagull commented 5 years ago

Yeah, this was going to be the next step @jacobstanley . But still, there are no benchmarks yet 😅 , it'd be great to have something to measure that

axman6 commented 4 years ago

Any update on this? I can't see any way in which String would be faster than Text, since the String needs to be parsed from a Text in the Aeson Value type. I'd really like to try moving some of out lambdas to the native runtime but seeing things like this makes me worry about the project.

NickSeagull commented 4 years ago

The decision of using String was due to what was said above. Most of the inputs are provided by AWS as environment variables, which is read as a String by default, having to convert back and forth between Text and String was troublesome because there were so many unnecessary conversions.

seeing things like this makes me worry about the project

In what way? If your worry is about exposing String to the public API, it could be easily changed.

axman6 commented 4 years ago

Sorry for the late reply on this, and apologies for my misunderstanding. I had forgotten how much was passed to the native runtime as Strings from env vars. I should look into making a version of System.Environment which uses ByteString, to avoid the conversion to String all together...

axman6 commented 4 years ago

If you can bare the dependency on the unix package, you can use https://hackage.haskell.org/package/unix-2.7.2.2/docs/System-Posix-Env-ByteString.html to grab the env vars. Personally I'd prefer to have these as ByteStrings and make conversions to other types if I need them.