cdinger / scenic-oracle_adapter

An Oracle adapter for Thoughtbot's scenic gem
MIT License
6 stars 5 forks source link

Add `parallel` option for mviews #16

Closed IanWhitney closed 2 months ago

IanWhitney commented 2 months ago

Adding a custom configuration option for Oracle materialized views, parallel. When set like this

create_view(:mview_name, materialized: {parallel: true})

Then the mview will be created with

CREATE MATERIALIZED VIEW mview_name PARALLEL...

What is parallel?

From Oracle Docs

The parallel_clause lets you parallelize creation of the table and set the default degree of parallelism for queries and the DML INSERT, UPDATE, DELETE, and MERGE after table creation. ... Specify PARALLEL if you want Oracle to select a degree of parallelism equal to the number of CPUs available on all participating instances times the value of the PARALLEL_THREADS_PER_CPU initialization parameter.

In my testing this setting has led to faster mview refresh times. But I wasn't sure about the impact of using it. So I asked Andy Wattenhofer, an expert on Oracle. (emphasis mine)


Me: Is there a reason I wouldn't want to always use it?

Andy: It's not a simple answer, but you would probably always want to use the parallel clause for large datasets. It could cause problems if everyone were to use it at the same time. You're effectively creating several server threads to perform the data load in parallel chunks. It would be possible to saturate one or more of the database servers if it were used too heavily. For that reason I would suggest that you try specifying an integer with it, such as parallel 8 or parallel 16 and fine tune it from there to find the value that maximizes on time for the smallest number of threads...I would request that you be careful in your use of too much parallelism.


In this version I'm not providing the integer paramater (i.e., parallel 8), instead just using parallel. This will always use the max number of CPUS. See the above Oracle doc excerpt.

The suggestion is that this should only be used on large, slow materialized views.

IanWhitney commented 2 months ago

I'm not able to run the test suite. Not sure why. Is the Oracle express container working for Apple chips yet?

For me the test command just hangs on waiting for the db. And Docker shows this status

scenic-oracle_adapter-db-1    container-registry.oracle.com/database/express:latest   "/bin/bash -c $ORACL…"   db        12 minutes ago   Up 12 minutes (unhealthy)   127.0.0.1:1521->1521/tcp, 127.0.0.1:5500->5500/tcp
cdinger commented 2 months ago

I'm not able to run the test suite. Not sure why. Is the Oracle express container working for Apple chips yet?

For me the test command just hangs on waiting for the db. And Docker shows this status

scenic-oracle_adapter-db-1    container-registry.oracle.com/database/express:latest   "/bin/bash -c $ORACL…"   db        12 minutes ago   Up 12 minutes (unhealthy)   127.0.0.1:1521->1521/tcp, 127.0.0.1:5500->5500/tcp

That's exactly the issue. The docker stuff in this repo is all AMD64 still. I do have a multi-arch oracle 19-ee image we've been using though (19-ee is the ONLY version Oracle has that supports ARM64) and I've been meaning to push this somewhere accessible by this project. I'll do that soon!

cdinger commented 2 months ago

@IanWhitney I've updated the docker stuff here to use some new multi-arch images. Can you rebase main onto this and see if you can run the suite? I'll also kick off the CI action here with the existing setup.

IanWhitney commented 2 months ago

~No luck. Still getting hung up on waiting for DB. I'll try to take a look at the Docker setup when I have some time later this week.~

Actually, there was a docker problem on my end. Works now. Tests pass.

cdinger commented 2 months ago

@IanWhitney can you push the rebased version up so we can run the checks here?

IanWhitney commented 2 months ago

I haven't made any changes. I just rebased my branch on top of your latest commit and ran the tests. Looks like CI is failing due to a missing Docker login?

cdinger commented 2 months ago

I've updated the build steps in main. If you force-push your local rebased branch it'll run under the new images/etc. I could fix the docker secrets for this one build but it's not being used anywhere else anymore.

cdinger commented 2 months ago

@IanWhitney there are some string formatting spec errors in the last test run. These are the same errors I was getting locally.

I think this is a good addition though. You won't be able to use the scenic generators with this parallel option, but I think that's OK for now. I'd rather not override their generators at this point. If there are more Oracle-specific options in the future we can look into that. Or propose upstream changes that more easily accommodate adapter-specific options.

IanWhitney commented 2 months ago

You won't be able to use the scenic generators with this parallel option,

You mean that there's no rails g scenic:view blah --parallel? I'm ok with that.

IanWhitney commented 2 months ago

More info on using parallel from another Oracle expert

It does help in some cases, but also has diminishing returns. Low values like 5 can speed things up, but if you go too high (20+ in most cases) it starts to slow down. There's not a hard and fast guideline on when to use it.

My personal recommendation is to try tuning as best you can without parallel. If the performance just is not good enough, test with low values like 5, but consider it a bandaid, and maybe look for better long term solutions (index?)

cdinger commented 2 months ago

Thanks for the contribution, @IanWhitney! I'll get a new release out soon.

cdinger commented 2 months ago

So @IanWhitney I went to try this in my test project and realized that we don't have a way to expose this to scenic's public interface. Their create_view method that's used in ActiveRecord migrations would need to either specify this parallel keyword or pass a hash of options down to the adapters. This is the same issue we discussed with adding support for self-refreshing mviews.

I'm going to revert this PR merge for now. We should think about proposing a scenic change that'd accommodate adapter-specific options like this. Then we could build out both of the features you're asking for.

IanWhitney commented 1 month ago

I may circle back to this at some point, but as a workaround you can just add a parallel hint to the query in the db/views folder

select /*+ PARALLEL(4) */ ...
cdinger commented 1 month ago

Good call. You could also alter it immediately after creating it with something like alter materialized view blahs parallel. But your solution is more self-contained.