Closed schvarcz closed 8 months ago
Hello guys.
@vadz answered me on the PR:
Thanks for working on this!
I don't use rowset API myself, so I'm not too familiar with it and could be missing something here, but I'm a bit surprised by the need to modify the code not clearly related to rowsets to implement support for blobs there when they do work just fine with just use() and into() already (which I do use in my own code). It would be best if this could be avoided, i.e. if the backend files could remain unmodified.
Also, thank you for adding a test, but ideally the test should work for all backends, not just SQLite, and actually test that retrieving the blobs works, too.
Thanks again!
I will answer it here since the conversation seems to go beyond my initial expectations. I will explain what I mean.
First things first, thanks once again for your reply! It seems that everything is well connected inside the code, which is good, but sometimes takes a simple modification a long journey into the code. Which is good to learn the philosophy adopted here.
Yes, I want to implement these tests for the other backends. I need it working at very least on SQLite3
and Postgres
(ideally on MySQL
too). The remaining backends I will implement to keep the SOCI
compatibility as it should. (See more on the "obs" section)
However, while I was reading the code related to blob
, I realised that currently SOCI
doesn't have a regular compatibility between different backends for this data type.
You said you are using use()
and into()
with blob
, right? So I presume you are using or SQLite3
or Firebird
, is that correct?
Currently, it seems that SQLite3
and Firebird
are the only ones supporting use()
with blob
functions, as you can see in the tests for ORACLE and Postgres. (SOCI does not support blob
for the others backends).
USE 1: SQLite3
and Firebird
do use the function use()
to save modifications as any other data type:
{ // For SQLite3 and Firebird
blob b(sql);
sql << "select img from soci_test where id = 7", into(b);
CHECK(b.get_len() == 0);
b.write(0, buf, sizeof(buf));
CHECK(b.get_len() == sizeof(buf));
b.append(buf, sizeof(buf));
CHECK(b.get_len() == 2 * sizeof(buf));
sql << "update soci_test set img=:blob where id = 7", use(b); // Save it on DB
}
a similar code can be seen here.
USE 2: While Postgres
and ORACLE
save the modifications on the database at the moment you call the function write
.
{ // For Oracle and Postgres
blob b(sql);
sql << "select img from soci_test where id = 7", into(b);
CHECK(b.get_len() == 0);
b.write(0, buf, sizeof(buf)); // Save it on DB
CHECK(b.get_len() == sizeof(buf));
b.append(buf, sizeof(buf)); // Save it on DB
CHECK(b.get_len() == 2 * sizeof(buf));
}
code copied from here.
Which forces the user to first create a dump empty blob
inside the database with a SQL
command:
{
// Create a empty `blob` on `Postgres`
sql << "insert into soci_test(id, img) values(7, lo_creat(-1))";
// Create a empty `blob` on `Oracle`
sql << "insert into soci_test (id, img) values (7, empty_blob())";
}
code copied from here and here.
Which doesn't make too much sense to me.... In order to insert a new information you have to run an insert
, followed by a select
, to only then insert
your blob
data.
The version implemented on SQLite3
and Firebird
makes more sense to me. It is more similar to how we modify any other value on the database.
One could argue that blob
data is saved as a string
on SQLite3
, where Postgres
saves it in a file apart from the table itself. However, Firebird
does a similar job as Postgres
. Nevertheless, it is implemented as it is. Besides, imho, an ordinary user does not pay attention to these details on a regular use. Also, we could still provide a similar interface to an advanced user as it is today.
After digging the code, it seems to me that there is a confusion on the roles of class blob;
and class blob_backend;
. IMHO, the first should be a container into where the blob
data navigates inside the code, similar to what class holder;
does for the general use. The second one should be the backend functions to actually insert the data into the db. I believe this confusion came from the time when SOCI
supported only Oracle
and didn't use to need an abstract transport layer to support many backends. Just a guess.
USE 3: Also, for the other variable types, SOCI
seems to support a "later evaluation" (or as you wanna call) to simplify the data input:
{
statement st1 = (sql.prepare <<
"insert into soci_test (id, name) values (:id, :name)",
use(id), use(name));
id = 1; name = "John"; st1.execute(1);
id = 2; name = "Anna"; st1.execute(1);
id = 3; name = "Mike"; st1.execute(1);
}
code copied from here.
Which currently will not work with blob
data on any backend. I could try to implement something for this, but since it is not currently supported by SOCI
, I will let it just as a comment for now.
Actually, the simple fact of re-using a blob
class will yield errors:
{ // Another issue on Firebird (I believe that it doesn't affect SQLite3)
blob b(sql);
b.write(0, buf, sizeof(buf));
CHECK(b.get_len() == sizeof(buf));
sql << "insert into soci_test(id, img) values(1, :blob)", use(b); // Saves the blob on db.
sql << "insert into soci_test(id, img) values(2, :blob)", use(b); // Saves an empty blob
}
I am adding this test on the tests batch and fixing where it is happening.
Only a suggestion: Talking about "later evaluation". It would be nice if we could have a blob
function just to mark that a variable should be treated as blob
by SOCI
, and actually takes care for us of all necessary transformations.
{ // Only a suggestion
std::vector<int> vec1 = {1, 2, 3, 4}
statement st1 = (sql.prepare <<
"insert into soci_test (id, m_list) values (:id, :blob)",
use(id), use(blob(vec1)));
id = 1; st1.execute(1); // Saves {1, 2, 3, 4}
id = 2; vec1.push_back(5); st1.execute(1); // Saves {1, 2, 3, 4, 5}
}
{
std::vector<int> vec2;
sql << "select m_list from soci_test where id = 1", into(blob(vec2));
print_vector(vec2); // Prints {1, 2, 3, 4}
}
I think it is doable, but it is out of scope of the discussion here and/or my initial motivation. I am letting it here just as a suggestion.
My initial goal was to support blob
on rowset
. So I can retrieve binary data on a table-like format, together with other informations, as I need for a current application we have here with SQLite3
that is migrating to Postgres
and/or MySQL
. I did have to change a thing or two to allow this use on Firebird
and very likely I will need to change to allow on Oracle
too. I am not going into details here, but the appropriated tests are going to be implemented on all backends too.
In short (or in tables). Current blob support:
|
Oracle | Firebird | Postgres | SQLite3 | MySQL | DB2 | ODBC |
---|---|---|---|---|---|---|---|
into() |
✅ | ✅ | ✅ | ✅ | 🚫 | 🚫 | 🚫 |
use() |
🚫 | ✅ | 🚫 | ✅ | 🚫 | 🚫 | 🚫 |
use() w/ later eval |
🚫 | 🚫 | 🚫 | 🚫 | 🚫 | 🚫 | 🚫 |
rowset() |
🚫 | 🚫 | 🚫 | 🚫 | 🚫 | 🚫 | 🚫 |
Where I am aiming: | Oracle | Firebird | Postgres | SQLite3 | MySQL | DB2 | ODBC |
---|---|---|---|---|---|---|---|
into() |
✅ | ✅ | ✅ | ✅ | ❓ | 🚫 | 🚫 |
use() |
✅ | ✅ | ✅ | ✅ | ❓ | 🚫 | 🚫 |
use() w/ later eval |
🚫 | 🚫 | 🚫 | 🚫 | 🚫 | 🚫 | 🚫 |
rowset() |
✅ | ✅ | ✅ | ✅ | ❓ | 🚫 | 🚫 |
OBS 1: If you wanna follow the CI tests, I am running them here. This repo has a good CI pipeline implemented for GitHub Actions, idk why it is not running on my PR here. So I forked the project to run it there. OBS 2: Just so you know, I wanna work on the final backend (Oracle) tonight or tomorrow. If everything goes fine, a PR will be ready to evaluation on Monday's morning.
Sorry, I didn't have time to look at this in details yet, but we do use BLOBs (or at least CLOBs, for XML storage, but I think they are very similar) with Firebird, Oracle and PostgreSQL, so I am not sure what the problem is. I don't think we ever reuse them, however, so there could be bugs when doing it.
Another question I have is what exactly do you mean by "good CI pipeline", i.e. how is it better than what we have? If you have any improvements to suggest for our CI workflow, please don't hesitate to make (another) PR for it, TIA!
I mean your CI pipeline, the GitHub actions one. For some reason, yours GitHub actions is not running on that PR, so I created a fork just to run yours GitHub Actions CI pipeline.
In my case here, we need a library that runs the exactly same code on different backends, which is not the case for blob
right now. When you would have a time, please check it.
If you think it is not from SOCI
's interest to normalize this use, I will keep that code on my fork and use from there.
The CI should run now, it just had to be manually authorized for the first run by a new contributor.
And of course all the backends should work the same, ideally. Anyhow, let's wait until your PR is finalized and discuss the final version when it's ready.
looking forward to this!
Hello Gabriel.
It is almost done for all engines. I am missing just oracle. But I work got me and I ended letting it aside for a time.
If you would like to help finish and/or would like to use to another backend, be my guest to checkout on my branch.
Kind regards
I think all of the mentioned features here (except for the "tell soci to use arbitrary objects as blob-like data") are implemented in #992
@schvarcz Have all these suggestions been implemented? Thanks
Hello everyone.
First, thanks for this very useful library. We are really looking to use this library on our company due the support to several backends.
However, I see some room for improvement where I hope I can help.
On our case, we have heavily use the
soci::blob
. Right now, I am in a situation where I have to use it withsoci::rowset
. However, when I do something likem_rowset.get<blob>("m_blob")
,soci
returns the error:I do understand the nature of the problem and why
soci::blob
needs to be initialised withsoci::session
. I also can see that similar issues were raised in the past, with the same nature.Since I can't see another way to retrieve
soci::blob
objects in a bulk way, I had to dig into it.I believe that we have 3 main points to check when we are talking about retrieving information with blob:
For the first case, the solution seems to rely direct on the
core
, sincesoci::rowset
has a pointer tosoci:statement
; andsoci::statement
has a public pointer tosoci::session
.If we can rearrange
soci::rowset
andsoci::row
so thesoci::session
pointer can reach the final class, we might solve the issue for the first case. (Ps: we also should be careful for the cases ofsoci::rowset<soci::blob>
.I would like to know what do you think about this solution. It seems to me that we are supposed to have a method to retrieve blob data in a bulk manner, since it is available to all other types and it seems to be a recurrent issue report here. If you agree with this implementation, I can work on that this week and submit a PR. Otherwise, I am open to other solutions.
Very likely I will dig into the case 2. in the near future.
Kind regards