Open tractorcow opened 6 years ago
cc @silverstripe/core-team for review :D
I guess the major question is here, which classes should have which roles for schema + instance:
And the other question is, is shifting config off DataObject subclass ok? Most people are used to adding it to their model + using it straight away. It may be a pain to users to need 2 classes.
This is great. Thanks, @tractorcow. I don't have a lot of time to write up a full response now, but in general, I love the direction. A long time ago, I attempted to solve this problem in a similar, but much less complex way by having a universal DataObject extension that acquired class-specific schema and CMS fields from YAML. In my experience building moderately complex websites, anywhere between 50-80% of DataObjects required nothing beyond YAML declarations, and that was really valuable, particularly for getting a site going quickly, and also making it accessible to designers and other team members who lacked PHP proficiency.
Architecturally, I think I have a hard time opposing any well-engineered plan to deconstruct the DataObject
monolith into a more aspect-oriented approach, with smaller bodies of single concerns. Decoupling the schema from the business logic seems like a great start.
A declarative way to do getCMSFields and exposure to ModelAdmin seem like really good ways to reduce boilerplate code, as well, but that's another RFC.
Anyway, very keen to hear from others on this!
This means that, for instance, we can create test class dataobjects which do not require us to create a test file. These can exist purely in YML.
I'm really excited about what this will mean for creating fixtures! SapphireTest performance with fixtures is another topic, but it would help so much if we could fixture say a Member with only YAML.
extends: Page # Virtual parent class. Maybe not necessary?
Perhaps if we called it inherit
(or even inherits
for multiple) it might not draw the assumption attached to extends in PHP? You'd be kind of doing this with schema: PageSchema
already though.
In my experience building moderately complex websites, anywhere between 50-80% of DataObjects required nothing beyond YAML declarations
Yep, this is so true
Re: the instantiation of virtual classes via DataObjectSchema::get_one($name)
etc, I kind of feel like we should be promoting some sort of dependency injection as a higher priority in SS5 - maybe implementing something like this into Injector would be a good start? For example, something like Injector
has getItems($name) : DataObjectSchema[]
and getItem($name) : DataObjectSchema
I had a bit of a think about this last night; I think creating two classes for a dataobject is unnecessary.
So I propose:
DataObjectSchema
as the schema instance. We shouldn't need to subclass this; They are model-specific instances, so the config will be injected into each instance of @DataObjectSchema
.ModelManifest
to store and manage the list of schemas.I like this path.
At a high level, I'd see it like so:
Not sure:
I'm not entirely sure that this needs to be change/major
- it could potentially be a minor change if we preserve the existing API.
I'd probably call the interface DataType
and make it completely absent of static methods.
You might have ClassBasedDataType
which derives a data type from DataObject static properties, preserving current behaviour.
foreach(ClassInfo::subclassesFor(DataObject::class) as $dataClass) {
$types[$dataClass] = new ClassBasedDataType($dataClass);
}
DataObjectSchema would probably be deprecated and/or rewritten to pass all its calls through to methods on the type object in question. Something like this (which assumes each type is a prefixed name in Injector; there are other approaches such as a DataTypeSet service)
public function DataObject::tableName($class)
{
return Injector::inst()->get('DataType.' . $class)->tableName();
}
This would keep us from breaking the DataObjectSchema API.
You could add ConfiguredDataType as a separate implementation, that used property setters rather than statics:
$configurableDataType = new ConfiguredDataType();
$configurableDataType->setTableName('orders');
$configurableDataType->setFields([
'OrderID' => 'Int',
'Price' => 'Decimal',
]);
$configurableDataType->setHasOne([
'Member' = Member::class,
]);
$types['Orders'] = $configurableDataType;
...which can then easily be built inside Injector yaml.
Just going to point out that if you can do this as a minor change, it gets into most of the community's hands within months rather than years.
I'd be interested to see how this could fit in a minor change release, it seems like a big paradigm shift at face value.
The concept is really powerful and it's really nice to see. I will greatly miss being able to easily discover and traverse classes in IDEs...
Looks like we'll utilise FormFactories
more with this approach coming up.
I like the idea of making big changes by making a new API rather than radically changing an existing one.
Then you can deprecate the old API...
And in the next major release delete it.
Easier said than done in some cases but a good principle to try for.
gets into most of the community's hands within months rather than years.
SS5 is only nine months away, if we're going to keep ourselves honest. By the time this gets into a minor release, it would likely be less than six months away, so I would weigh the cost/benefits of not breaking any APIs.
I will greatly miss being able to easily discover and traverse classes in IDEs...
Yeah, this is a huge drawback. I'd expect the natural outcome would be 100+ line, all-inclusive YAML files, which actually becomes a disservice to discovery.
I like the idea of making big changes by making a new API rather than radically changing an existing one.
Yup, that sounds fine, as long as old API is not dramatically changed to transition the new API in, then that could be alright.
It's not uncommon to overlook the old API functionality when the new API is transitioned in though.
100+ line, all-inclusive YAML files
I hope it doesn't lead to practices of monolith YAML file(s), that would be worse :P
On the "100 lines of YAML, no more code hinting" risk: my assumption is that most DataObjects would still use the original PHP approach, and this tool would only be used where there were no special features of the object.
A particular use-case I could see for this is to allow the creation of new page types with no extra PHP code. You could, for example, build on this with a feature in a theme that injected new page types, meaning that a theme could provide a new page types independently. It's beyond the scope of this ticket but it's a way it could grow.
It's probably worth noting that allowing this use case would also mean we're being a bit more honest with our API. We currently have DataObjectInterface which is useless outside of DataObjects, so this would be a good example of us dogfooding our interfaces a bit more.
At the same time it's also a great excuse to do some refactoring in DataObject =)
I thought I would put in my own two sense on the matter even though I am not a core contributor. Overall I think tractorcow's original idea is great and will help developers who are writing tests immensely.
I do however think that you may want to create a dedicated configuration file (model.yml
or schema.yml
perhaps?) file. Otherwise, I think this change might cause some consistency issues for developers coming to a new project or module.
Right now developers can reasonably assume that a project or module will store it's DataObjects files in a Model
folder (under src/
). It would be nice if you could store the yaml file for "Virtual DataObjects" within the same folder.
There is also the question of whether storing "Virtual DataObjects" in the main configuration file is appropriate for anything other than testing purposes.
P.S. I don't know if issue threads are normally open to community feedback so I hope I am not speaking out of turn here. Please take my comment as a grain of salt.
P.S. I don't know if issue threads are normally open to community feedback so I hope I am not speaking out of turn here.
Not at all, we always welcome community feedback - on issues, RFCs and pull requests 😄
This might be trickier than it seems. I've looked into adding "mock" dataobjects from PHPUnit tests into the manifest so that we could do proper testing of mocked dataobjects and having them show up in inheritance heirachies and such, but the nesting feature of the manifests doesn't appear to work as expected... So that needs fixing to get this working, I think..
Background
This RFC proposes a solution to allow a new dataobject specification to be built using only config.
This proposal will allow us to split the class base for specific dataobject specifications away from requiring explicit classes.
This means that, for instance, we can create test class dataobjects which do not require us to create a test file. These can exist purely in YML.
Details
Class structure
I propose an extension of the DataObjectSchema implementation. Ever since I wrote this feature I've felt that the implementation was out of place and served no real purpose, other than "make dataobject.php shorter".
The basic proposal is that each logical-class will be backed by two real-classes:
As with Page and PageController, the relation between the two will be based on naming convention. The default "schema" class for any model is
<FQN>Schema
. YML classes don't need to conform to this convention.Schema registration
The registration of all models will be based on the union of:
That meaning, creating a subclass of DataObjectSchema will automatically register that as a model. These classes are self-describing, meaning that config such as
db
,has_one
, and so on will be drawn not from theBlogPage.db
, but fromBlogPageSchema.db
. In fact,BlogPage
class may not even exist.YML Registration for objects would be:
schema
andinstance
would just default toDataObjectSchema
andDataObject
if omitted, so they aren't mandatory.We could instead make the schema / instance default to the parent
extends
model.Parent db / has_one etc would inherit from EITHER (but probably not both) of
extends
orschema
. That may require some thinking. :)Typical convention would be that, for virtual schemas, there is no namespace.
Schema retrieval
Similar to how
DataObject::get()
/::get_one()
can be used to query instances of a dataobject, you can useDataObjectSchema::get_one($model)
to get a schema for a named model.Mosts tasks that previously used DataObject::singleton() to do things on the schema (such as build the DB) now use DataObjectSchema::get_one($class).
DataObjectSchema::get() will return all schemas for the current db.
API
Options / variants
Since we are moving the config for model specification to
DataObjectSchema
, it may be less disruptive instead to keep the schema onDataObject
, and make that the schema class (DataObjectSchema
goes away).We would instead have a
DataObjectInstance
(or justDataItem
for short).