Closed jrandrews closed 9 years ago
Also a list of output from pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 15.0.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
pandas: 0.16.2
nose: 1.3.7
Cython: None
numpy: 1.8.0rc1
scipy: 0.13.0b1
statsmodels: 0.6.1
IPython: None
sphinx: None
patsy: 0.4.0
dateutil: 1.5
pytz: 2013.7
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.3.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: 0.9.1
apiclient: None
sqlalchemy: 1.0.8
pymysql: None
psycopg2: None
Thanks for the report, an you try some of these things:
engine.has_table
work?meta = sqlalchemy.MetaData(bind=engine); meta.reflect()
) Does it recognize tables?engine.has_table
seems to work:
from sqlalchemy import *
orcl_connect_string = 'oracle+cx_oracle://system:oracle@192.168.56.101:1521/?service_name=orcl'
engine = create_engine(orcl_connect_string)
connection = engine.connect()
if engine.has_table(table_name='INT_PROP'):
print 'Yes, has the table'
else:
print 'No table'
Results: Yes, has the table
I tried to test for metadata reflection. I am sorry, I might not understand exactly what you are asking for -- here is the code that I ran:
from sqlalchemy import *
orcl_connect_string = 'oracle+cx_oracle://system:oracle@192.168.56.101:1521/?service_name=orcl'
engine = create_engine(orcl_connect_string)
meta = MetaData(bind=engine)
print meta.reflect()
and the results:
None
I tried this too and what is interesting is that I didn't seem to get a full list of tables that this user has access to:
for t in meta.sorted_tables:
print t.name
Results:
logmnr_attrcol$
logmnr_attribute$
logmnr_ccol$
logmnr_cdef$
logmnr_col$
logmnr_coltype$
logmnr_con$
logmnr_container$
logmnr_dictionary$
logmnr_dictstate$
logmnr_enc$
logmnr_icol$
logmnr_ind$
logmnr_indcompart$
logmnr_indpart$
logmnr_indsubpart$
logmnr_kopm$
logmnr_lob$
logmnr_lobfrag$
logmnr_logmnr_buildlog
logmnr_ntab$
logmnr_obj$
logmnr_opqtype$
logmnr_partobj$
logmnr_props$
logmnr_refcon$
logmnr_seed$
logmnr_subcoltype$
logmnr_tab$
logmnr_tabcompart$
logmnr_tabpart$
logmnr_tabsubpart$
logmnr_ts$
logmnr_type$
logmnr_user$
logmnrc_con_gg
logmnrc_concol_gg
logmnrc_gsba
logmnrc_gsii
logmnrc_gtcs
logmnrc_gtlo
logmnrc_ind_gg
logmnrc_indcol_gg
logmnrc_seq_gg
logmnrp_ctas_part_map
logstdby$apply_progress
movie1_stg
movie2_stg
What is interesting is that the last tables in that list are defined as external tables in Oracle. Those have some funny properties compared to normal tables. Has anybody ever tested using external tables in Oracle with SQLAlchemy or Pandas? I wasn't ever trying to select from the external tables, but I wonder if their presence in the metadata is already causing issues.
Well, read_sql_table
internally uses this reflected meta object to get the table. So we should investigate why it is not included there (meta.tables
), while engine.has_table
does work.
You can also give echo=True
to the engine creation (create_engine(..., echo=True)), to get some more debugging information on what is happening on the sqlalchemy calls.
So I think I have found the issue after setting echo=True for create_engine(). When SQLAlchemy queries the Oracle system catalog it filters out tables belonging to two users, specifically SYSTEM and SYSAUX. Here is the actual query:
SELECT table_name FROM all_tables WHERE nvl(tablespace_name, 'no tablespace') NOT IN ('SYSTEM', 'SYSAUX') AND OWNER = :owner AND IOT_NAME IS NULL AND DURATION IS NULL
Now, I am connecting as the SYSTEM user because this is running in a local virtual machine on my laptop which has no connectivity to the outside world. In any sort of real scenario this would be a bad idea, but in my scenario it isn't important. I copied the table to another user's schema and then ran the python script logging on as that user. read_sql_table now works. So it's an issue with SQLAlchemy silently hiding any tables belonging to SYSTEM or SYSAUX. I realize that this isn't a pandas problem. I looked at SQLAlchemy documentation and there is no mention in the Oracle dialect of SQLAlchemy doing this.
I can see why filtering out SYSTEM and SYSAUX tables on Oracle might be useful in some cases because there are a lot of them and they aren't helpful to the average user, but there should probably be a note or an indication and an option in the engine or connection constructor to allow for turning this behavior off. I can see certain scenarios where I might specifically want to access data in those tables.
OK, good you found it! I am closing this issue, as it is not something we can do anything about, but you can certainly raise the issue at sqlalchemy if you want (although I suppose they will have a good reason for it, but at least they can mention it in the docs).
Hello!
It's seems that I faced with the same error...
I found that you could avoid "NOT IN ('SYSTEM', 'SYSAUX')" statement:
sqlalchemy_engine = sqlalchemy.create_engine(sqlalchemy_con_str, case_sensitive=False, echo=True, encoding='latin1', **exclude_tablespaces=None**)
And it's working as intended:
2017-02-20 20:49:34,833 INFO sqlalchemy.engine.base.Engine SELECT table_name FROM all_tables WHERE OWNER = :owner AND IOT_NAME IS NULL AND DURATION IS NULL
But in my case this option did not solve the issue:
Traceback (most recent call last): File "/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py", line 257, in read_sql_table meta.reflect(only=[table_name], views=True) File "/usr/local/lib/python3.5/dist-packages/sqlalchemy/sql/schema.py", line 3815, in reflect (bind.engine, s, ', '.join(missing))) sqlalchemy.exc.InvalidRequestError: Could not reflect: requested table(s) not available in Engine(oracle+cx_oracle://opc_op:***@ovm-hpomdb.sdab.sn:1521/?service_name=hpom): (OPC_ACT_MESSAGES)
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/home/vetal/PycharmProjects/hpom_srv_stat/fetch_from_db.py", line 37, in
p_df = pd.read_sql_table(str(table), sqlalchemy_engine).assign(CSVTimeStamp=current_tstamp) File "/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py", line 259, in read_sql_table raise ValueError("Table %s not found" % table_name) ValueError: Table OPC_ACT_MESSAGES not found
Could anyone help me in that?
I had the same issue. Now solved. Try table name in lower case. Thanks to Basel Issmail from SO.
Thank you, @ChuliangXiao. Your solution worked for me.
Using read_sql_table I think there is a bug because it always makes the reflect call with 'views=True', this cause an exception 'NotImplementedError'.
(sql.py lin.269):
try:
meta.reflect(only=[table_name], **views=True**)
schema.py lin.4671 returns method not implemented
(if views available.update(insp.get_view_names(schema))
Please, Do you know any way of fixing it? Thks!
PD: Ref. to the code, the method doesn't provide the views parameter
def read_sql_table( table_name, con, schema=None, index_col=None, coerce_float=True, parse_dates=None, columns=None, chunksize: Optional[int] = None, ) -> Union[DataFrame, Iterator[DataFrame]]:
Pandas is very helpful and I have been using it heavily recently. However, I have run into an issue where this code works:
but this code fails:
Error message I get back:
I can't think of any other extenuating circumstances -- it's a table, not a view, the userid logging in has full rights to the table in question. I've also tried adding the schema name and I get the same error back.
Versions:
Problem occurs whether running from within Pycharm CE or running at the terminal console with python.