zendframework / zend-db

Db component from Zend Framework
BSD 3-Clause "New" or "Revised" License
101 stars 122 forks source link

Cannot access, nor test, the protected variable $this->select in Insert.php #175

Open dmalouf opened 7 years ago

dmalouf commented 7 years ago

In ./src/Sql/Insert.php there is a protected variable $select (line 46). However, this variable is not able to be retrieved directly because

  1. The magic getter (line 275) only retrieves values from $this->columns
  2. ->getRawState (line 158) only retrieves $this->table and $this->columns (keys, values)

As such, it is not possible to access $this->select as it is hidden from the calling-code, thereby making it impossible to test the Select object stored in $this->select. The only workaround at this time is to use ->getSqlString() and use string comparison on the actual SQL string (eew).

Can we provide (read-only) access to $this->select in order to facilitate testing?

alextech commented 7 years ago

I do not think it is that eew in this case. Usage of $select is implementation choice. Technically, that component does not have to use it and is subject to change that implementation. What is important is the end result, not how you got to the end result, for your unit test. You need to verify that the output SQL statement is created correctly and will not cause errors on your database engine. Looking at state of Select object does not accomplish that, but getSqlString() actually does.

Since it is implementation detail, opening up API for it would be misleading, IMO.

But if you want to still test against protected variable, do not let me stop you. You can use reflection to convert protected property to public. That is something I have seen done here now and again so not completely crazy thing to do. During your verification stage make a utility function that will through reflection allow $insertInstance->select usage accomplishing what you want.

dmalouf commented 7 years ago

@alextech... hmmm, you've got me finding assumptions I didn't know I had (always good!).

I get that we approach unit testing a bit differently (we don't test the SQL string output / getSqlString()), so perhaps there are some bits of understanding that I'm not grasping. With that in mind...

Based on what I think you're saying, I'm not sure why any of the getters would be necessary? For me, I'm thinking $this->select should be treated the same way as$this->table, no? Is there a reason why table would be available via getRawState(), for example, but not select? Simply because table is required and select is not (if I'm reading what you wrote correctly)?

alextech commented 7 years ago

Is there a reason why table would be available via getRawState()....because table is required Yes. Raw state supplies data directly relevant and absolutely needed to perfrom INSERT database operation. Cannot insert without knowing table name to insert into, and probably should not insert data into any random column of that table. Pretty useful to get those right.

Everything else is convenience to achieve that goal that can be done any which way up to discretion of whoever submits a PR (as long as maintainable enough). I think there is a PR that fundamentally changes how SQL is generated using Builder Design Pattern that separates SQL code from Zend DB API, as an example.

Then, there is self-documentation issue. In API design what is helpful writing code like "well written prose".

$customerInsertion = new Insert();
$customerInsertion->setTable('customers');
$customerTable = $customerInsertion->getRawState()['table']

reads fine.

$customerSelect = $customerInsertion->getSelect()
what?? As your hypothetical coworker I will wonder "What could he be possibly trying to do here? There is no such thing in SQL as select statement generated from insert operation! Maybe he means select last inserted record to get latest ID or something?" and we go down the wonderful path of confusion.

Want to reflect real world of SQL methodology of database engine as much as possible not deviating too much from it (as with any business logic OOP design in general). We know SELECT is possible to go inside of INSERT statement, so there is a setter for it. But there is no such thing in PostgreSQL, MySQL, MSSQL, to have a SELECT statement appear after running INSERT so adding it to API as if it is possible is misleading. You could argue, "I want to get SELECT for the sake of inspection of how INSERT used it" to which I will reply that you would already have a reference for it when you passed it in line 147. You could pass a mock object with method expectation definitions if you are writing a test that verifies it was called correctly if needed, or a stub and inspect state of stub after Insert finishes its thing, instead of fishing it out from a protected property.

dmalouf commented 7 years ago

@alextech - thank you for explaining yourself more, I do appreciate it.

But I also feel like I'm just not getting what you're saying. So if I may provide a few differences between what you're writing and what I'm thinking so that perhaps (a) I can identify what I'm not understanding in your comments thus far and (b) I can better explain what I'm thinking as I am wondering if I'm miscommunicating.

First, I'm not sure why you are stating that "Raw state supplies data directly relevant and absolutely needed to perfrom INSERT database operation." (emphasis added by me) This seems counter to how the Select getRawState() (here) performs, no? In the code of that link, getRawState() offers numerous items that are not "absolutely needed", no? Is there something different about Insert (compared to Select)?

Second, I was not really suggesting adding a new method as you have in your last comment. I fully agree that what you wrote reads poorly ($customerInsertion->getSelect()). I was thinking something more along the lines of the last line of your '"well written prose"' section... something like this (based on your example above):

$addressSelect = new Select();
$addressSelect->from('addresses')->columns(... ;

$customerInsertion = new Insert();
$cusomterInsertion->setTable('customers');
$customerInsertion->select($addressSelect);

$customerTable = $customerInsertion->getRawState()['table'];
$addressQuery = $customerInsertion->getRawState()['select'];

If $insertObject->select() is never used, then the value of $insertObject->getRawState()['select'] would be null - just like what happens with the Select class when a property is not set. Does this approach "read" differently than the example you gave regarding $customerTable? If so, can you please help me understand?

Again, thank you for your demeanor in this dialog. I really like learning from others, especially when they actually dialog (vs. just making statements without explanation nor taking into account the nuance of what I'm asking).

alextech commented 7 years ago

You are welcome. I also appreciate conversation about different use cases. Can only have so much experience, unless talk to others.

I really like your example. That does make me think I could not be entirely correct. I did not consider possibility of expanding raw state. I was too fixated on idea of getter, for some reason.

About whether INSERT vs SELECT having items absolutely needed. What I mean is, each class is a wrapper to SQL for convenience. SQL query syntax has sections. It is useful to programmatically categorize them into states instead of awkwardly concatinating strings. Maybe I abused definition of word "absolutely" (not my first language, still struggle in that department). Perhaps "exclusively" is better? That is, I did not want to see unrelated syntax managed by a class. SELECT has everything select related, INSERT has everything insert related and no more. But your example, which I just ran in my debugger to see a usecase, does indeed show a flaw in design.

(1) Even if we assume that SELECT query state does not belong to INSERT state, (of which I am not so confident anymore) it is no good that return of getRawState() does not return list of columns that will be affected. SELECT setter should fill those in. I or you can do a PR for that for sure, and see if it will be accepted.

(2) values key being empty is also misleading. It probably should not even exist with INSERT/ SELECT combo.

(3) So the conclusion is that when SELECT is present, its own getRawState should indeed be present, because yes, it does contribute to INSERT's state the way you have it written. Plus, have column list duplicated with end result

[
'table'=>'customers',
'columns'=> ['street', 'house'],
'select' => [] // output of $this->select->getRawState
]

I think I will take back some things I said, pretty good fix. I can make a PR for it, or would you like to give it a try yourself (with full tests etc.)?

alextech commented 7 years ago

or no, take idea of duplication into 'columns' out, since SELECT could be more complex, those keys should just not exist.

dmalouf commented 7 years ago

First, your English is great - the nuance of the word "absolutely" can be difficult for native English speakers, too!

I'm going to poke around at some ideas this morning and make a PR. But I'll be sure to connect here on what I find, what I'm thinking, etc. That is, if still love some feedback even on the PR.

I'm learning a lot from this dialog - glad you're willing to explain your thoughts!

dmalouf commented 7 years ago

@alextech - if you have time, I'd love for you to review, comment, etc. on the PR that is now 'attached' to this Issue (https://github.com/zendframework/zend-db/pull/177)!

alextech commented 7 years ago

In case the PR comments did not make too much practical sense, consider this set of queries:

create table anime (
  id SERIAL PRIMARY KEY,
  name TEXT
);

insert into users(name) values('Rin');
insert into users(name) values('Tomoyo');

create table animeSource (
  id SERIAL PRIMARY KEY,
  name TEXT
);

insert into animeSource(name) values('Fai');

insert into anime
select * from animeSource;

will crash with

ERROR:  duplicate key value violates unique constraint "anime_pkey"
DETAIL:  Key (id)=(1) already exists.

So that conflict needs to be resolved to make sure SELECT does not include the id column, and if it does, replace the value of 1 with anime_pkey.NEXTVAL. To do so columns key will need to be populated fully with whatever SELECT is planning to through against the Insert.

(now that I thought this corner case through, my first criticism that select state is not needed in insert's state is so bad XD)

michalbundyra commented 4 years ago

This repository has been closed and moved to laminas/laminas-db; a new issue has been opened at https://github.com/laminas/laminas-db/issues/94.