djpirra opened this issue 1 year ago
As this issue is presented, I'm not sure yet whether it is actually an issue with pandas or with one of the other services you're using (Apache Spark or SQLAlchemy).
Can you provide us with more details, like what the DataFrame you're trying to insert looks like? Your repro script is a little vague, and I don't have Oracle, so I can't test this locally. I have a feeling this is more of an issue with oracle+oracledb
than with pandas, but I need more info to confirm or deny that.
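To make that easier, a self-contained, Spark-free script along these lines would help isolate pandas from the rest of the stack. This is only a sketch; all connection values below are placeholders, not taken from the original report:

import pandas as pd
from sqlalchemy import create_engine

# Placeholder credentials; fill in real values for your Oracle instance.
engine = create_engine(
    "oracle+oracledb://:@",
    connect_args={"user": "<user>", "password": "<password>", "dsn": "<dsn>"},
)

df = pd.DataFrame({"ID": [1, 2], "NAME": ["a", "b"]})

# Does the failure reproduce outside Spark, with a plain pandas call?
df.to_sql(name="TABLE_NAME", con=engine, if_exists="replace", index=False)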
Hi @djpirra. I ran into a similar issue with the Snowflake database and, after some research, was able to resolve it. Could you please try the following change and see whether it helps on your end?
import pandas as pd
from sqlalchemy import create_engine, types

df = pd.read_csv(<csv file>, header=0)

# Upper-case the column names to match Oracle's default identifier casing
df.columns = [col.upper() for col in df.columns]

engine = create_engine(
    'oracle+oracledb://:@',
    connect_args={
        "user": adb_username,
        "password": adb_password,
        "dsn": connStr,
    })

# Give every column an explicit VARCHAR(200) type for table creation
dtyp = {c: types.VARCHAR(200) for c in df.columns}

# Note the lower-cased table name; see the explanation below
df.to_sql(name=table_name.lower(), schema=adb_username, con=engine,
          if_exists=if_exists, dtype=dtyp, index=False, method='multi')
In my environment, I noticed that to_sql raises an error when the table name is passed in upper case, even though, once the table is created, its name shows up upper-cased in the database anyway.
This behavior seems to affect multiple databases, as discussed in the following references:
Let me know if this works or if further adjustments are needed!
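For context on why the lower-cased name matters, here is a minimal sketch (reusing the engine and schema variables from the snippet above; table names are illustrative). SQLAlchemy's Oracle dialect normalizes unquoted identifiers to lower case during reflection, so filtering on an upper-case name can miss the lookup that pandas performs internally:

from sqlalchemy import MetaData, inspect

# Oracle stores unquoted identifiers in upper case, but SQLAlchemy
# normalizes them to lower case when listing tables:
print(inspect(engine).get_table_names(schema=adb_username))
# e.g. ['table_name', ...]

meta = MetaData()

# Works: the lower-cased name matches the normalized listing
meta.reflect(bind=engine, schema=adb_username, only=["table_name"])

# Raises sqlalchemy.exc.InvalidRequestError ("Could not reflect:
# requested table(s) not available"), as in the traceback below
meta.reflect(bind=engine, schema=adb_username, only=["TABLE_NAME"])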
Pandas version checks
[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[ ] I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
sqlalchemy.exc.InvalidRequestError: Could not reflect: requested table(s) not available in Engine(oracle+oracledb://:***@): (TABLE_NAME)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:555)
    at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:713)
    at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:695)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:508)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at scala.collection.Iterator.foreach(Iterator.scala:943)
    at scala.collection.Iterator.foreach$(Iterator.scala:943)
    at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
    at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
    at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
    at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
    at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
    at org.apache.spark.InterruptibleIterator.to(InterruptibleIterator.scala:28)
    at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:358)
    at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:358)
    at org.apache.spark.InterruptibleIterator.toBuffer(InterruptibleIterator.scala:28)
    at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:345)
    at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:339)
    at org.apache.spark.InterruptibleIterator.toArray(InterruptibleIterator.scala:28)
    at org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1030)
    at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2254)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more
Expected Behavior
The objective is to write to a table whose name and column names are all in upper case. The table is created successfully, but the operation fails at the data insert. The expected behavior is for the write to complete successfully.
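For completeness, here is a minimal sketch of the failing call next to the lower-case workaround suggested above (connection setup omitted; df and engine are assumed to match the snippet earlier in the thread):

# Fails: as reported above, the table is created, but the insert step
# then fails with the reflection error because the upper-case name
# misses SQLAlchemy's normalized (lower-case) table listing for Oracle.
df.to_sql(name="TABLE_NAME", con=engine, if_exists="replace", index=False)

# Works: pass the name in lower case; Oracle still stores and displays
# the identifier in upper case.
df.to_sql(name="table_name", con=engine, if_exists="replace", index=False)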
Installed Versions