Open lejoko opened 1 month ago
Pretty interesting. I think there will likely be some kind of explicit solution required here. For example, we can add a DSL option:
identity_index_lengths ....
and then we can require that if you have a :string
type attribute, that you configure this. Additionally, we have a migration_types
configuration that could potentially be used here. So like unless you set a migration type indicating a specific size, we require that you set a corresponding identity_index_length
. IIRC you can do things like:
migration_types field: {:varchar, 255}
PRs welcome on this front! I won't have time to implement this myself in the foreseeable future unfortunately.
Hmm... not sure I understand your idea with identity_index_lengths. Let me start by restating things another way, because I feel that I may have been a bit fuzzy in my previous message:
VARCHAR
rather than TEXT
for most strings (probably a cultural thing dating back from the time when text fields had problems that varchar did not have) and therefore they usually do not have to specify an index size. It would be surprising to have to do it every time.As a result of all of that, in the current state, all migrations generated with an index on a string field need to be manually changed.
As a first step, I'd like to change the driver in a way that would remove the need to change migrations manually in most cases, even if the change is not perfect. For that part I see several options:
:string
type instead of :text
I have a bias to option 1, for the simple reason that I know how to implement it and have currently no idea how to implement 2. If you think that 2 would be a better idea, I'd be happy to have pointers to where to look to be able to implement it.
At a later time I/we may implement a cleaner solution that might auto-calculate fields size depending on the number of index fields and/or use a new dsl option
Thoughts ?
š¤ The point about identity_index_lengths
wasn't necessarily that the name or idea was right, just that we can solve some of these kinds of problems (i.e they've asked for an identity/index that requires an index size) with additional configuration in the DSL.
What I think we don't want to do is have ash_mysql
have a signfiicantly different default handling of strings than other data layers, in such a way that can be relatively easily missed by users.
I think changing :string
to map to {:varchar, some_limit}
would be very surprising for folks using ash_mysql
coming from other data layers. I'd rather introduce a compile time error requiring that an explicit migration type be set for all strings that are used in an identity or index.
This gives us an opportunity to explain to the user why :text
can be problematic in those cases. Then, if they set
migration_types field: :text
, we can require further that they specify a field size, in options like index_field_sizes [index_name: [field_name: 42]]
, and identity_field_sizes: [identity_name: [field_name: 42]]
.
On reflection, maybe there is an incorrect assumption on my end here: can you set a column to type VARCHAR
without a size to avoid this issue? Seems strange if that works, but if it does then sure, we can use :string
š
Ok. I understand that you want ash_mysql to stay as consistent as possible with other data layer. However it still needs to take MySQL idiosyncrasies into account (and MySQL has lots of them... š). I wanted to understand why MySQL people usually prefer VARCHAR over TEXT so I dived a bit into MySQL docs.
From what I understand, contrary to Postgres and Sqlite, VARCHAR and TEXT don't work the same in MySQL. MySQL's TEXT fields are internally BLOB fields. They are not store like normal fields but in a separate space and are instantiated separately in memory. Because of that they have the following restrictions (that VARCHARS don't have):
Apparently, it is not only an historic thing (as I initially thought) but the recommended way is to use VARCHAR rather than TEXT when possible in MySQL.
On the other hand, mysql fields are restricted to a maximum of 65535 bytes for any single non BLOB/TEXT field, which means 16383 chars for an utf-8 VARCHAR field, and the maximum size of all fields of a table is also 65535, except for BLOB/TEXT fields.
When seeing that, I tend to think that ash_mysql should either default to :string/VARCHAR for Ash strings rather than :text/TEXT even if it is not the case in other data layers, or it should force ALL uses of Ash strings in ash_mysql resources to also define a migration_types
for them (and not only those referenced by an index). Defaulting to :text/TEXT with no warning, could be both very surprising for MySQL users, and a source of unexpected problems.
Thoughts ?
You've convinced me :) Lets swap the default.
Wow, that was a fast answer... š
Of course that does not remove the index length problem. it only avoids it in most standard use cases.
Yeah, and it makes them opt-in to the weirdness you've described. They'll have to accept that the maximum length generated for string attributes is 65,535, and if they want more they have to use migration_types field: :text
, which based on your excellent points will be a much better experience than the reverse. š
Thanks for the due diligence and persistence š !
Thanks for Ash once more! The day I read its description, it immediately clicked with me. I was desperately looking for a way to build apps around a descriptive data model in Elixir. I used to be a big Ruby on Rails fan and still love the ease with which you can build complex apps with it, by being mostly descriptive. Ash gives me a big part of the same goodness and other kind of goodnesses, in a full Elixir spirit (more "strict/pure/perfectionist" than the Ruby way but still very pragmatic). the only thing that I still miss is good tooling for building web UIs. But as far as I understand, there are things in the making ?
They are a ways out, but yes :)
Hi @zachdaniel !
I post that here because I see a mysql specific behaviors that I'm not sure how to manage in ash_mysql regarding strings. I'd be happy to discuss it with you.
MySQL has a limited size for indices (768 bytes in MySQL 8.0). When a field is bigger than that size, any index on that field must be created with an explicit size. Ash fields of type
:string
are currently (as in ash_sqlite) translated in migrations to the:text
Ecto type which is then translated to theTEXT
MySQL type. That is always bigger than MySQL's max index size. It means that all indices on those fields currently need to have their migrations manually changed to add an explicit index size . I could make ash fields of type:string
be translated to the:string
Ecto type which is itself translated to theVARCHAR(255)
MySQL type. That one is smaller than MySQL's max index size, and then it wouldn't require manual editing of migrations in most cases (it could still be a required for multi field indices). But in that case, I'm not sure how people could create fields of type:text:
. In the ideal case, I would probably leave it as it is now add an automatic calculation of indices size when required, but that would be rather complex and I'm not ready to undertake such a task at this stage.(The problem can be painful at times, because MySQL doesn't have transactional schema changes, so when a migration crashes because of a missing index size, you're left with a partially applied migration, which prevents both applying other migrations and rolling back the one that crashed... You have to either clean the mess manually or reset the whole db...)