**Open** · tc19901016 opened 2 weeks ago
Use `Arc<Mutex<Vec<Row>>>` instead. If you have no need to wrap the `Vec` with `Arc`, use `Vec<Row>` instead of `Arc<Vec<Row>>`.

In order to transfer a type across threads, `Send` must be implemented. Both `Arc<Mutex<Vec<Row>>>` and `Vec<Row>` implement `Send`, but `Arc<Vec<Row>>` doesn't. That's because `Arc<T>` implements `Send` only when `T` implements both `Send` and `Sync`, and `Vec<Row>` doesn't implement `Sync`.
Thank you. However, I have another question. I want to convert the `Vec` — can I update this line by using `Arc::new(...)`? Thank you.
You cannot. Though `Sync` must be implemented to share references between threads, `Row` doesn't implement it.

Well, to be precise, you can wrap `Row` in `Mutex` to share references between threads, but using it with threading will be slower than simple code without threading.
Though I'm not sure what your use case is, it is better to use `query_as` instead of `query` in general. See https://github.com/kubo/rust-oracle/blob/master/docs/query-methods.md#with-and-without-_as
Another option is https://quietboil.github.io/sibyl/. It uses Oracle Call Interface directly. On the other hand rust-oracle uses the interface via ODPI-C. I'm not sure which is faster.
I tested millions of rows of data in Oracle using JDBC and rust-oracle. Depending on the data type, some queries show Rust is faster and others show JDBC is faster. The records are below:

1) 30 columns (NUMBER(28,18)) * 56,700,000 rows

| Rust-oracle: row.next()+row.get() (sec) | Rust-oracle: just row.next() (sec) | jdbc: rs.next()+rs.getFloat() (sec) | jdbc: rs.next() (sec) |
| -- | -- | -- | -- |
| 44 | 26 | 21 | 10.9 |

2) 30 columns (NUMBER(10,0)) * 56,700,000 rows

| Rust-oracle: row.next()+row.get() (sec) | Rust-oracle: just row.next() (sec) | jdbc: rs.next()+rs.getInt() (sec) | jdbc: rs.next() (sec) |
| -- | -- | -- | -- |
| 24 | 21 | 20 | 10.4 |

3) 30 columns (DATE) * 56,700,000 rows

| Rust-oracle: row.next()+row.get() (sec) | Rust-oracle: just row.next() (sec) | jdbc: rs.next()+rs.getDate() (sec) | jdbc: rs.next() (sec) |
| -- | -- | -- | -- |
| 25 | 23 | 82 | 13 |

My boss believes there must be a way for Rust to read data of every type faster than JDBC, so I must explain why: 1) row.next() is slower than JDBC. (I need to test more; I thought this would be faster than JDBC, but in fact it is not.) 2) NUMBER is slower than JDBC. Maybe other types (like VARCHAR) are also slower; I will test tomorrow.
All the SQL statements look like:

```sql
select T_DATE AS A0,T_DATE AS A1,T_DATE AS A2,T_DATE AS A3,T_DATE AS A4,T_DATE AS A5,T_DATE AS A6,T_DATE AS A7,T_DATE AS A8,T_DATE AS A9,T_DATE AS A10,T_DATE AS A11,T_DATE AS A12,T_DATE AS A13,T_DATE AS A14,T_DATE AS A15,T_DATE AS A16,T_DATE AS A17,T_DATE AS A18,T_DATE AS A19,T_DATE AS A20,T_DATE AS A21,T_DATE AS A22,T_DATE AS A23,T_DATE AS A24,T_DATE AS A25,T_DATE AS A26,T_DATE AS A27,T_DATE AS A28,T_DATE AS A29 from tableA
```
The test cases are below:

1) Rust-oracle: row.next() + row.get()

```rust
use std::error::Error;
use chrono::Local;
use r2d2_oracle::OracleConnectionManager;

fn oracle_test_iter(sql: &str) -> std::result::Result<(), Box<dyn Error>> {
    println!("oracle_test_iter_col sql:{}", sql);
    let manager = OracleConnectionManager::new("xxx", "xxxx", "1.1.1.1:1521/xxxx");
    let pool = r2d2::Pool::builder().max_size(1).build(manager)?;
    let conn = pool.get()?;
    for _ in 0..4 {
        let start_time = Local::now();
        // let col_len = conn.query(sql, &[])?.column_info().len();
        let mut stmt = conn
            .statement(sql)
            .prefetch_rows(1024)
            .fetch_array_size(1024)
            .build()?;
        let rows = stmt.query(&[])?;
        let col_len = rows.column_info().len();
        for row in rows {
            let r = row?;
            for i in 0..col_len {
                // change the target type to match the column data type:
                // get::<_, Timestamp>, get::<_, i64>, get::<_, f64>, ...
                let _ = r.get::<_, String>(i);
            }
        }
        let duration = Local::now() - start_time;
        println!("duration:{}", duration);
    }
    Ok(())
}
```
2) Rust-oracle: row.next() only

```rust
fn oracle_test_only_read(sql: &str) -> std::result::Result<(), Box<dyn Error>> {
    println!("oracle_test_only_read sql:{}", sql);
    let manager = OracleConnectionManager::new("xxx", "xxxx", "1.1.1.1:1521/xxxx");
    let pool = r2d2::Pool::builder().max_size(1).build(manager)?;
    let conn = pool.get()?;
    for _ in 0..10 {
        let start_time = Local::now();
        let mut stmt = conn
            .statement(sql)
            .prefetch_rows(1024)
            .fetch_array_size(1024)
            .build()?;
        let rows = stmt.query(&[])?;
        // let col_len = rows.column_info().len();
        for _row in rows {
            // fetch only; per-column access is disabled in this test:
            // let r = _row?;
            // for i in 0..col_len {
            //     let _ = r.get::<_, String>(i);
            // }
        }
        let duration = Local::now() - start_time;
        println!("duration:{}", duration);
    }
    println!();
    Ok(())
}
```
3) jdbc: rs.next() + rs.getFloat()

```java
private static void test_iter(String sql, int testNum, boolean onlineFlag) throws Exception {
    Connection conn;
    if (onlineFlag) {
        conn = DriverManager.getConnection("jdbc:oracle:thin:@1.1.1.1:1521/xxx", "xxx", "xxx");
    } else {
        conn = DriverManager.getConnection("jdbc:oracle:thin:@1.1.1.1:1521/xxx", "xxx", "xxx");
    }
    System.out.println("test_iter " + sql);
    for (int k = 0; k < testNum; k++) {
        long timestamp = System.currentTimeMillis();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setFetchSize(1024);
            try (ResultSet rs = ps.executeQuery()) {
                ResultSetMetaData metaData = rs.getMetaData();
                int columnCount = metaData.getColumnCount();
                while (rs.next()) {
                    for (int i = 0; i < columnCount; i++) {
                        float columnValue = rs.getFloat(i + 1); // rs.getInt / rs.getDate for the other types
                    }
                }
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
        long duration = System.currentTimeMillis() - timestamp;
        System.out.println("Finished:" + duration);
    }
}
```
4) jdbc: rs.next() only

```java
private static void test_only_read(String sql, int testNum, boolean onlineFlag) throws Exception {
    Connection conn;
    if (onlineFlag) {
        conn = DriverManager.getConnection("jdbc:oracle:thin:@1.1.1.1:1521/xxx", "xxx", "xxx");
    } else {
        conn = DriverManager.getConnection("jdbc:oracle:thin:@1.1.1.1:1521/xxx", "xxx", "xxx");
    }
    System.out.println("test_only_read " + sql);
    for (int k = 0; k < testNum; k++) {
        long timestamp = System.currentTimeMillis();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setFetchSize(1024);
            try (ResultSet rs = ps.executeQuery()) {
                ResultSetMetaData metaData = rs.getMetaData();
                int columnCount = metaData.getColumnCount();
                while (rs.next()) {
                    // fetch only; per-column access is disabled in this test:
                    // for (int i = 0; i < columnCount; i++) {
                    //     String columnValue = rs.getString(i + 1);
                    // }
                }
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
        long duration = System.currentTimeMillis() - timestamp;
        System.out.println("Finished:" + duration);
    }
}
```
In the flame graph of version 0.6.1-dev in https://github.com/kubo/rust-oracle/issues/44#issuecomment-2185927331, `upirtrc`, which is an internal function inside the Oracle client library, takes 47.77% of samples. Could you make a flame graph of your code? I guess that more than half of the time is consumed in the Oracle library in the "Rust-oracle: just row.next()" case.
As for the difference in seconds between NUMBER(28,18) and NUMBER(10,0), I guess that is caused by here.

When the column definition is NUMBER(10,0), the column values are converted from NUMBER to int64 inside Oracle Call Interface (OCI). `row.get::<_, i64>` just copies the int64 value.

On the other hand, when the definition is NUMBER(28,18), the values are passed through OCI, formatted as a string representation in ODPI-C, and then converted to the requested type such as `f64` inside rust-oracle. In the current implementation, the string values owned by ODPI-C are converted to `f64` via `String`. I may improve it by using a `&str` referring to the buffer in ODPI-C instead of a temporary `String` owning data copied from the buffer.
@tc19901016 Did you try slightly bigger arraysize values? For such a large number of rows I would have left prefetch size at its default.
@kubo ping us if you need help, or to nudge us about requests like https://github.com/oracle/odpi/issues/172
@kubo You are right.

1) In the rust-readonly test (test case 2), `dpiStmt_fetchRows` costs 56%; `RowValue::get` and `drop` cost the rest. I think that's why the rust-readonly test is slower than JDBC (maybe JDBC only runs something like `stmt.next()`).

2) In the rust-itercol test (test case 1), NUMBER(28,18) increases the `get_string_unchecked` cost compared to NUMBER(10). f64 flamegraph: i64 flamegraph: Though the flamegraphs are created, I have no idea how to increase the reading speed. I will try to modify this project; if I get some results, I will write a comment.
My Rust program has an oracle-read thread and a consume thread:

1) The oracle-read thread uses `ResultSet::next` to get `Row`s and sends an `Arc<Vec<Row>>` to a `tokio::sync::mpsc` unbounded queue.
2) The consume thread receives the `Arc<Vec<Row>>` and does something.

But the Rust compiler outputs an error: `*mut oracle::binding::binding::dpiVector` cannot be shared between threads safely.

Thank you.

The oracle-read code is below:

The consume thread code is below:

Queue code:

```rust
let (tx, mut rx): (
    UnboundedSender<(Arc<Vec<Row>>, i32, i32)>,
    UnboundedReceiver<(Arc<Vec<Row>>, i32, i32)>,
) = mpsc::unbounded_channel();
```
```
note: required by a bound in `tokio::spawn`
  --> C:\Users\tianchuan01\.cargo\registry\src\mirrors.ustc.edu.cn-12df342d903acd47\tokio-1.40.0\src\task\spawn.rs:167:21
    |
165 | pub fn spawn<F>(future: F) -> JoinHandle<F::Output>
    |        ----- required by a bound in this function
166 | where
167 |     F: Future + Send + 'static,
    |                 ^^^^ required by this bound in `spawn`
```