yiisoft / yii2

Yii 2: The Fast, Secure and Professional PHP Framework
http://www.yiiframework.com
BSD 3-Clause "New" or "Revised" License
14.23k stars 6.92k forks source link

Another joinWith() approach #1618

Closed creocoder closed 10 years ago

creocoder commented 10 years ago

Suggest joinWith() removing and making one with() method with following signature:

with($with, $eagerLoading = true, $joinType = 'LEFT OUTER JOIN', $together = false);

to make API more logical.

qiangxue commented 10 years ago

with() is an interface method shared by all DB solutions. However, not all DB solutions support join. So it's better we separate joinWith() from with().

creocoder commented 10 years ago

So as i see the aspect of separation is:

However, not all DB solutions support join.

Ok, fine. Such DB solutions can ignore $together param for example. What other advantage of separating?

qiangxue commented 10 years ago

The advantage of separation is that it is clearer - when you see the term joinWith(), your immediate response is that it will create a JOIN SQL and you should be extra careful. Using a $together parameter to indicate this is not as clear and could be easily overlooked.

creocoder commented 10 years ago

@qiangxue Ok, anyway joinWith() looks for me like a fast hack to close #1581 issue. Dont you think issue was born by eliminating "together" feature from relations? How about more radical approach? I mean something like Yii 1 approach with dinamic relation options.

qiangxue commented 10 years ago

We don't want to go for the 1.1 approach because it involves complicated table/column aliasing, which is not trivial to implement and is also tricky to use. The issue pops up because people are accustomed to the 1.1 approach which mixes the concept of data fetching and data filtering. In practice, it's very possible that you join with some relations while eager loading some other relations. Data fetching and filtering should be separated as designed in 2.0. The joinWith() implementation isn't a hack to #1581. I consider it as an important complement to with().

slavcodev commented 10 years ago

I did not understand the need joinWith so this post does not make sense

Ragazzo commented 10 years ago

@slavcodev lazyWith and eagerWith is more confusing. I think that people who knows that Yii2 AR first select all id's than select related AR by this ids would find more confusing eagerWith than joinWith, because second one is exactly pointing to what sql condition will be executed.

creocoder commented 10 years ago

@qiangxue So the joinWith() role is strictly practical reasons, right? To simplify joins with relations.

nineinchnick commented 10 years ago

I agree that this should be a separate method but I don't like @qiangxue explanation. You shouldn't state that data fetching and filtering should be separated. I think most database system, SQL or not, are faster in performing those tasks than PHP. Even NoSQL systems are advertised as providing an efficient map-reduce filtering method.

I think the most important cons are:

The pros are:

I'm not listing aliasing as a con. Personally I never had trouble with that. Even if it's hard to implement it doesn't state anything about good design.

I like when three types of data loading are defined:

Actually the terms lazy and eager should be well recognized by devs, as it is common to a lot of ORMs. Now introducing them to with and together creates the confusion. I'd say that eagerWith would be better than joinWith, even gramatically.

Recently I extensively audited the ActiveFinder from Yii 1 to add support for MANY_MANY through relations. It's complex, but complete. It's a shame to throw it away. I find it very useful and I build some tools around it.

To conclude, I think it's a very important feature and if it's not done well it can drive a lot devs away from Yii 2.

PS. Actually, the best solution is to use subqueries instead of separate queries and passing list of PK values. That allows cross-referencing tables with some limitations. This is just a crazy idea, I'm going to think about it more in near future when browsing the code.

qiangxue commented 10 years ago

@creocoder Yes.

@nineinchnick The biggest drawback of AR in 2.0 is the potential performance issue for eager loading (even this may not be very true). Other than this, it has all the pros you listed for AR 1.1.

As you may have found out, the current AR 2.0 design allows it to be supported by NoSQL and other non-relational DBMS, and it even supports relation between ARs from different types of DBMS.

Regarding the term "eager" and "lazy", the rule for AR 2.0 is very simple: if you use with() or joinWith(), it is eager loading; otherwise it is lazy loading. For this reason, the terms eagerWith() and lazyWith() are not any more expressive. On the contrary, they may bring confusion (e.g. what does lazyWith() do?)

nineinchnick commented 10 years ago

You're right, it's just a question what terms should we use for eager and eager-er, that is the two eager loading methods I mentioned.

I'd just like to address that the current method of eager loading by passing a list of PK values to a second query feels dirty. I will look into it and try to find a solution.

creocoder commented 10 years ago

@nineinchnick

I'd just like to address that the current method of eager loading by passing a list of PK values to a second query feels dirty. I will look into it and try to find a solution.

You confuse eager loading with query strategy. Eager loading is wide term which related not only to ORMs. Both with() and joinWith() use eager loading methodology, but different query strategies. So your comment is related to query strategy of with() and not related to eager loading methodology at all.

creocoder commented 10 years ago

@nineinchnick

Actually, the best solution is to use subqueries instead of separate queries and passing list of PK values.

Not all DBs support subqueries. Also there is some troubles with subqueries in some legacy mysql versions for example.

qiangxue commented 10 years ago

I'd just like to address that the current method of eager loading by passing a list of PK values to a second query feels dirty.

We had the similar thought initially when we were redesigning AR. But our tests and investigation found that it is not that bad. It actually has many benefits, including the support for relational queries for NoSQL DBs. Performance wise, there is no proof that it would be worse than join (it may be faster in fact.) The only limitation is when you have too many PKs returned, the queries for related records could be too long to be accepted by DBMS. However, this is very uncommon.

creocoder commented 10 years ago

@qiangxue Another idea for this:

$items = MyModel::find()
    ->with([
        'relation1' => function ($query, $options) {
            $options->together = true;
            $options->joinType = 'INNER JOIN';
        },
        'relation3' => function ($query, $options) {
            $query->andWhere(...);
            $options->together = true;
        },
        'relation4',
        'relation5',
    ])

there is no proof that it would be worse than join

I remember times when CActiveFinder::together become true by default with opposite argument :) So to be honest the real reason to avoid joins IS NOT perfomance. Reason is easy caching and some NoSQL dbs troubles. If someone will not use caching he will garantee loose perfomance with NoJOIN approach. Can prepare proofs to show NoJOIN approach without caching VS JOIN approach without caching.

creocoder commented 10 years ago

@qiangxue And little more here:

However, not all DB solutions support join.

Fine, lets name this more abstractly as together. For RDBS its JOIN. I'm sure all NoSQL db solutions have possibility to make queries like that. Am i wrong? If not, where is the problem and why query strategy was changed to:

potential performance issue

strategy ?

qiangxue commented 10 years ago

@creocoder I'm interested to see your profiling results which compares non-join and join queries.

I have pretty much explained why we chose the current design and method names in my previous replies. I think we should spend time repeating the same debate again.

creocoder commented 10 years ago

@qiangxue Ok will create profiling tests then with different environments: local sql server and remote sql server.

qiangxue commented 10 years ago

Cool. Make sure you disable query cache when testing: http://dev.mysql.com/doc/refman/5.1/en/query-cache.html

cebe commented 10 years ago

Fine, lets name this more abstractly as together. For RDBS its JOIN. I'm sure all NoSQL db solutions have possibility to make queries like that. Am i wrong?

No, most of them do not have the possibility to retrieve data in a join-like query. Querys do return records of one type in most cases and if it is multi type then the way the records are fetched is different from relation way but like cross type search or something like that. redis does not have it because it is key value only. elasticsearch can only fetch parent-child relation in one query. Not sure about Mongo. In general you can not assume JOIN to be present in NoSQL DBMS Most of them have different goals and join is the most tricky part in querying data as far as I see so they leave it out as it may not easily be distributeable over a cluster for example.

konapaz commented 7 years ago

Is there any way to load eager data "with" or 'joinWith' for ORM by another database ? I use Yii 2.0.9

For example $companies = Company::find()->joinWith('city')->all();

In model Company I have

public function getCity() {
        return $this->hasOne(\common\models\City::className(), ['id' => 'city_id']);
   }

City is a model that has already configured to another database using:

 public static function getDb() {
        return Yii::$app->get('db_another_database');
    }

But is not work, the generated query does not include the name of another_database.city_table.id in condition LEFT JOIN, so a database exception occurs

I tried

$companies = Company::find()->leftJoin('another_database.city_table AS city' , 'company.city_id=city.id')->all();

But although the generated query is corrected the data of city for each company loaded by lazy method

SilverFire commented 7 years ago

@konapaz Duplicates https://github.com/yiisoft/yii2/issues/5470

konapaz commented 7 years ago

So the best solution I found is

$companies = Company::find()
            ->joinWith(['city' => function($query) { 
                   return $query->from('another_database.city_table')
            }])->all();

So what is your opinion ? @SilverFire

SilverFire commented 7 years ago

If you want to JOIN on plain SQL - yes, it's the best solution. If eager loading in multiple queries is fine for you - use Company::find()->with('city')->all();

keeper-evil commented 7 years ago

https://github.com/yiisoft/yii2/issues/5470#issuecomment-312884738