ibis-project / ibis

the portable Python dataframe library
https://ibis-project.org
Apache License 2.0
4.43k stars 544 forks source link

FEAT: [OmnisciDB] Support insert for OmnisciDB table #2119

Closed YarShev closed 3 years ago

YarShev commented 4 years ago

@xmnlab please, review this PR. Due to some causes, I can't add you into reviewers (but not only you).

YarShev commented 4 years ago

@jreback Can you please review this PR?

YarShev commented 4 years ago

@jreback, just a friendly reminder to review this PR.

YarShev commented 4 years ago

@xmnlab , @jreback , friendly reminder.

YarShev commented 4 years ago

@jreback , please review this PR. I can't add you into reviewers using regular way due to any reasons (not only you).

jreback commented 4 years ago

why are you adding these? these are not supported anywhere else AFAICT

YarShev commented 4 years ago

The following functionality (for omnisci backend) is required to my team: extract data (namely, specified columns or month, year, day from column) from one table and insert it into another table. It can be implemented using insert_into_select operation, that's why I've developed it. And just in case, I've implemented insert_into operation that may be useful for us as well.

xmnlab commented 4 years ago

@YarShev

IMHO I think you should have just one method called insert as used by impala (https://github.com/ibis-project/ibis/blob/master/ibis/impala/client.py#L446)

and you should have as input a dataframe or an ibis expression .. 1) if a dataframe is passed it should work as a SQL insert into, but 2) if the ibis expression is passed it should work as insert into select.

does it makes sense?

YarShev commented 4 years ago

@xmnlab , Thanks for the suggestion, insert_into and insert_into_select can be combined into one method insert. However, I don't think that pandas dataframe should be passed to the insert. Since OmniSci backend supports the insert of single-row values for some columns, I consider the list of values have to passed to the insert. What do you think @xmnlab , @jreback ?

xmnlab commented 4 years ago

@YarShev the current omniscidb.load_data can just load data when the dataframe columns are the same as the columns in the server. So I didn't understand the problem to use dataframe instead of use 2 parameters (a list of values and a list of names) ... also using dataframe will allow you to insert more than one row at once.

YarShev commented 4 years ago

@xmnlab , load_data and insert are different methods. As far as I know, load_data does the following: 1) load data into table if one is empty. 2) overwrites data in the table it one isn't empty.

insert method have to insert values (if dataframe is passed) into table, wherein adding the new records in the table. So, I've implemented new version this PR using dataframe and have combined methods. @xmnlab , @jreback, please review it.

pep8speaks commented 4 years ago

Hello @YarShev! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2020-06-03 05:50:49 UTC
YarShev commented 4 years ago

@jreback , just a friendly reminder.

YarShev commented 4 years ago

@jreback , just a friendly reminder.

YarShev commented 4 years ago

CI is red because no space left on device

xmnlab commented 4 years ago

Have you rebased on top of upstream master?

YarShev commented 4 years ago

Yes, I have pulled the latest master today. Maybe the problem with No space left on device isn't completely resolved.

YarShev commented 4 years ago

@jreback , just a reminder to review this PR, because I see the CI is green again

YarShev commented 4 years ago

@jreback , please, review this PR

datapythonista commented 3 years ago

Closing as stale. @YarShev if you want to continue working on this PR, please ping me, and I'll reopen it. Thanks!