Closed bausshf closed 2 years ago
Ping me on this once CI is fixed, we need to get these fixed. All my REST routes also need to be safe, and mysql-native in general doesn't make this easy.
Ok.
Previously, mysqln wasn't worrying too much about pure, safe and such for much the same reason - because vibe and phobos didnt make it very easy. Hopefully that's improved now.
Well, considering that vibe.d now requires @safe
just about everything, I think it should work better now. And it makes sense too. The whole point of @safe
is to fix internet exploits. What better place than a web server!
And for the record, I DON'T think a good solution is band-aiding @trusted
on everything. I'm willing to put in some work to get this right. But of course, CI needs to work.
So I started working on this. And I get way way deep down into the nitty gritty of mysql-native. Almost everything is easy to mark with @safe
except a few things.
First, there's a few places where TypeInfo
objects are compared. Object.opEquals is not safe, so I had to make a trusted version of that. Not a huge deal.
The second is a lot harder. Variant
's postblit is not safe. This is required, since any type can go in there. We really need a SafeVariant
type. Really, that's how mysql should be by default, we don't need to support any types that aren't safe.
We could also use TaggedAlgebraic (probably TaggedUnion, as we don't need to be able to forward function calls). This would probably be an improvement anyway, as TaggedUnion provides a cleaner set of mechanisms for basically storing data.
Or we could make our own MysqlVariant type or something that only handles the types we care about. Might be slightly more performant than a Variant or TaggedUnion.
What do you think? I don't want to go any further without some guidance here.
Hmm, I seem to be a little out-of-date here, I’ve used TaggedAlgebraic in the past, but it seems there have been changes/enhancements since then, and TaggedUnion is new to me. I just took a quick look at the docs but I’m still a little unclear: What exactly is the difference between TaggedAlgebraic and TaggedUnion.
In any case, if using Phobos Variant is fundamentally incompatible with proper @safety in mysqln, and TaggedAlgebraic/TaggedUnion would work, then that sounds reasonable to me. Besides, I agree Variant’s API just isn’t as nice or as modern and it’s maybe more heavyweight than needed.
My only concerns:
Looking ahead, if we also want to get more up-to-speed with pure and const-ness, TaggedAlgebraic/TaggedUnion wouldn’t be any worse than than Variant would it? (Frankly, that would surprise me anyway, but I have to ask.)
How feasible would it be have a transition period where BOTH TaggedUnion and Variant are supported? Something like: Using Variant would simply mean they wouldn’t be getting @safe, but if they transition to the new TaggedUnion approach then they’d get @safe? If it isn’t feasible, then that still isn’t necessarily a deal-breaker, but it would be a BIG help if we could avoid breaking peoples codebases, and let them smoothly transition their codebases gradually.
I’ve been away from this a little too long, but as I recall, there’s some stuff where Nullable is combined with Variant to mean certain things. Like distinguishing “no rows/values were received” vs “actually received a row/value and the value is null” (or...something like that…) Things like that. Would a TaggedUnion approach still be fully compatible with this stuff?
As for a “MysqlVariant” type, I would tend to shy away from inventing our own algebraic/tagged union type unless there was a very clear, specific reason why existing options weren’t sufficient and we knew we could do better. And even then, I’d still prefer to have it as an external general-purpose lib rather than having it built-into mysql-native.
Hmm, I seem to be a little out-of-date here, I’ve used TaggedAlgebraic in the past, but it seems there have been changes/enhancements since then, and TaggedUnion is new to me. I just took a quick look at the docs but I’m still a little unclear: What exactly is the difference between TaggedAlgebraic and TaggedUnion.
I've honestly not used either, but I think the difference that I can see is that TaggedUnion does not forward function calls, and has some different APIs. It appears to me that TaggedUnion is more suited for just storing and retrieving a union item. Whereas TaggedAlgebraic tries to provide a seamless API for all functionality of the union members.
In my brief test for this (to see if it forwards @safety
, it does), TaggedUnion didn't work for calling functions.
In any case, if using Phobos Variant is fundamentally incompatible with proper
@safety
in mysqln, and TaggedAlgebraic/TaggedUnion would work, then that sounds reasonable to me. Besides, I agree Variant’s API just isn’t as nice or as modern and it’s maybe more heavyweight than needed.
It might be easy to get a SafeVariant into Phobos, I'm actually surprised it's not there. I think it literally needs a copy-paste of VariantN but with @safe
tagged at the top.
My only concerns:
- Looking ahead, if we also want to get more up-to-speed with pure and const-ness, TaggedAlgebraic/TaggedUnion wouldn’t be any worse than than Variant would it? (Frankly, that would surprise me anyway, but I have to ask.)
Yes, TaggedUnion would be much more conducive to other attributes, as it's a fully templated call instead of a generated delegate.
- How feasible would it be have a transition period where BOTH TaggedUnion and Variant are supported? Something like: Using Variant would simply mean they wouldn’t be getting
@safe
, but if they transition to the new TaggedUnion approach then they’d get@safe
? If it isn’t feasible, then that still isn’t necessarily a deal-breaker, but it would be a BIG help if we could avoid breaking peoples codebases, and let them smoothly transition their codebases gradually.
It would be pretty simple, except for cases where mysql-native's functions are returning Variant. These would be harder to manage, because you can't call an overloaded function based on the expected return value. For sure, the API calls you would make on the return types are much different.
I'll have to look through the code and see how many of these there are. Perhaps it's not that bad.
- I’ve been away from this a little too long, but as I recall, there’s some stuff where Nullable is combined with Variant to mean certain things. Like distinguishing “no rows/values were received” vs “actually received a row/value and the value is null” (or...something like that…) Things like that. Would a TaggedUnion approach still be fully compatible with this stuff?
I'm not sure about this one. I'm sure I can come up with something that works. You definitely can have a TaggedUnion where one of the items is typeof(null)
.
As for a “MysqlVariant” type, I would tend to shy away from inventing our own algebraic/tagged union type unless there was a very clear, specific reason why existing options weren’t sufficient and we knew we could do better. And even then, I’d still prefer to have it as an external general-purpose lib rather than having it built-into mysql-native.
I agree, but I wanted to throw it out as a possible option. One possible benefit is that we could define specialized functions on such a thing.
I'll have to look through the code and see how many of these there are. Perhaps it's not that bad.
OK, so here are the places we would have to worry about this:
Prepared
's arguments all are implemented with variant. I think we might be able to upgrade the argument setting mechanism at the same time as we are adding @safe
, so this may be straightforward to deprecate and replace.Row
values. This is the trickiest one. I can't think of a better API than opIndex, so replacing the return type there would be disruptive. One thing we could do is add a property to get the "safe" row (which would be the actual implementation):struct Row
{
...
SafeRow safe() { return actualRow; }
}
Then you would use it like:
auto id = row.safe[id].value!int;
Not sure if we can deprecate the API completely, and replace it with the safe stuff, which sucks. We could intercept at the Range call, returning a safe Range instead of the normal range.
Variant
or Row
, which would need equivalents for the safe versions.Ok, so it looks like there aren't too terribly many places. So, what if we do it like this:
In version A, we add all our safe non-Variant overloads, we add Row.safe
as you describe, and for the problematic query calls we name the new ones querySafe
, queryRowSafe
, etc. In this same version, we mark all of the Variant-using versions with the deprecated
keyword (don't get me started on that "scheduled for deprecation" goofiness...) with a deprecation message about how the function's API will change in the future and they should migrate to the safe non-Variant versions to be ready.
After a certain amount of time, in some later version B, instead of outright deleting the deprecated Variant-based functions, we just delete their bodies and re-use their names for the safe non-Variant versions (ie, we rename querySafe
to query
, SafeRow
to Row
, etc...). We also remove the deprecated
keywords. We keep querySafe
/Row.safe
/SafeRow
/etc as optional (non-deprecated for now) aliases. Anyone who hasn't transitioned away from Variants will simply start getting type errors instead of the nice deprecation messages.
Later on, version C, we deprecate the now-redundant querySafe
/Row.safe
aliases. Eventually maybe remove them in a version D...or maybe not (at that point it won't really matter too much anyway).
OK, I'll start on an implementation using this idea. We'll see how it goes.
Got stuck on an issue, but so far, I think replacing Variant will be a net benefit. Issue is here: https://github.com/s-ludwig/taggedalgebraic/issues/36
@Abscissa getting back onto this, have a bit of time during Thanksgiving since I'm sick and home alone at the moment. I had a question on this code here: https://github.com/mysql-d/mysql-native/blob/master/source/mysql/protocol/comms.d#L127
I'm basically going to recreate this switch for converting existing Variant to new MySQL tagged algebraic type. But I don't have any support for shared(immutable(...
types. I'm wondering, is this a real issue? I can't imagine you put it in there without it actually coming up, but it seems kind of strange to have that be a real use case. First of all, shared(immutable(x))
is kind of strange, because immutable is automatically shared. But if you had a pointer to it, it's possible the pointer is shared, but the target is immutable?
Some guidance on this would be good, I don't love the idea of supporting shared in MySQL in any case, as the multi-threading implications are murky. Given how Walter is going to remove all usage of shared data without a cast in a very community-supported DIP, it might not make any sense to support shared at all.
My plan at the moment is to simply cast away the shared, as the pointer itself is not really shared (inside the Variant, it's locked away, and we are about to make a copy of it anyway).
In any case, a history of how that got there would be useful.
OK, doing some testing I've determined that what that switch does doesn't actually work. At all:
immutable int x = 5;
Variant v = &x;
const(int)*p = v.get!(const(int*)); // error, incompatible types
So, I'm not as worried ;)
FYI still working on this, haven't tested, but I have something that builds. See https://github.com/schveiguy/mysql-native/tree/safeupdate.
I need to get rid of all the deprecations (internal non-safe calls), and then work on the tests.
Cool.
IIRC, I don't think the "shared immutable" stuff was one of the changes I made. It's possible it might even predate my involvement in the project. So I'm not really sure what the story is there, I'd have to find the commit/pr that introduced it. (I do know it wasn't an especially recent addition, its been in there awhile.) My suspicion is it was probably added together with the "const" and "immutable" cases just as an attempt to be more permissive in what types could be passed in.
/facepalm So much for my memory! It WAS me who added that shared immutable stuff, and less than two years ago, even:
https://github.com/mysql-d/mysql-native/issues/13
With that in mind, I do know I didn't have a specific situation in mind, it was just an attempt to be more permissive. So no worries about nixing that.
More updates: I've decided on a different "radical" approach than what we discussed. I can always move back if necessary.
I found that I can mimic pretty much all of Variant
's features with TaggedAlgebraic
(it really is a nifty library). So I'm comfortable with doing the reverse of what was discussed -- by default everything will now be safe, and if you want unsafe Variant
code, you will use either a new (and deprecated) function asVariant
which will convert the safe type to a Variant
, or you can use unsafe
accessors in the row and range to get a version that converts to Variant
. So this will provide code that doesn't explicitly type anything as Variant
with (I hope) drop-in compatibility. For that case, you can call the stop-gap items mentioned above until you can fix your code.
This is a lot better than what I originally hoped. What tipped the scales for me was that queryValue
returns Nullable!Variant
. It was going to be very painful to switch this and keep compatibility.
We'd also want to be sure to minimize disruptions to user code that SENDS a Variant to mysqln, not just user code that RECIEVES a Variant. So maybe also a fromVariant or asMySqlVal and/or overloads that still accept a Variant FROM the user, to go along with your asVariant.
Also, at this point, if we're offering functionality that converts to/from Variant, do we really even need to use deprecations to make users change their own code to use MySqlVal directly? Maybe we could just leave asVariant/asMySqlVal un-deprecated, and if a user would rather base their own code around Variant and deal with occasional conversions when accessing mysqln and doesnt care about their own code being @safe, then so be it.
Other than that, it all sounds good to me.
The calls that accept Variant
are all still there and overloaded. The ones that accept Variant are marekd @system
and deprecated. I'm working on getting the tests to build, then I'll push what I have.
And I do have a Variant -> MySQLVal conversion there, but I don't want to necessarily expose it as public. It's used internally for all the places where Variant is accepted.
OK, I have updated my branch it's now building tests (haven't run them properly yet). I am waiting on a PR pull for taggedalgebraic, which I very much need to get this to work properly: https://github.com/s-ludwig/taggedalgebraic/pull/40
Going to bed... If I get a chance this weekend, I might do final testing and make a PR. But we still need the taggedalgebraic fix.
So I have got the tests to build and pass. Still waiting on the TA bug fix.
But I realize something that may be a problem with the idea of making MySQLVal
the default. Consider code like this:
auto row = mysql.queryRow(sql);
Variant v = row[0];
if(v != 1) launchMissles();
This compiles with the new system! What happens is Variant
wraps the MySQLVal
(something that wouldn't happen in reverse). I think what will happen is that it fails to run the opCmp, but the message will be pretty confusing.
I may have to go back to the original plan of making everything Variant
and deprecated by default, and then you have to access the new system via safe
accessors.
See my PR at #214
I can close this now. Please let me know if you find any issues with the latest rev (v3.2.0)
mysql\pool.d line: 115 column: 13
The call to ConnectionPool's constructor must use a
@safe
callback.Perhaps createConnection() can just be made
@trusted
?