Odonno / surrealdb-migrations

An awesome SurrealDB migration tool, with a user-friendly CLI and a versatile Rust library that enables seamless integration into any project.
https://crates.io/crates/surrealdb-migrations

"Payload too large" error when applying a large number of schemas #87

Open lylejohnson opened 2 months ago

lylejohnson commented 2 months ago

Describe the bug
When I apply migrations, I'm getting a "413 Payload Too Large" error for a relatively small migration file. I do, however, have around 1800 schemas, and I'm wondering if there's some interaction there.

To Reproduce

I really can't share the dataset, but I'm curious about what's going on under the hood when I run surrealdb-migrations apply. I know that it first "applies" the schemas and then starts applying the migrations. I have a lot of schemas, but none of them are particularly large (the largest file is maybe 4 KB). The migration that it chokes on is 1876 bytes in size.

Expected behavior
These are small files; they should apply without errors.


lylejohnson commented 2 months ago

I removed the migrations from the equation and now it can't even apply the schemas. Going to try turning it off and turning it on again.

lylejohnson commented 2 months ago

Here's the stack trace I obtained by running with RUST_BACKTRACE=full and COLORBT_SHOW_HIDDEN=1:

Error:
   0: There was an error processing a remote HTTP request: HTTP status client error (413 Payload Too Large) for url (http://surreal-db:8000/sql)
   1: There was an error processing a remote HTTP request: HTTP status client error (413 Payload Too Large) for url (http://surreal-db:8000/sql)

Location:
   /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/surrealdb-migrations-1.5.0/src/surrealdb.rs:157

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

   1: color_eyre::config::EyreHook::into_eyre_hook::{{closure}}::hfe90d364a9394ec8
      at <unknown source file>:<unknown line>
   2: eyre::capture_handler::hc4f19b775069f2a3
      at <unknown source file>:<unknown line>
   3: eyre::error::<impl eyre::Report>::from_std::ha6823c3fb19c183c
      at <unknown source file>:<unknown line>
   4: surrealdb_migrations::surrealdb::apply_in_transaction::{{closure}}::h520c8a1ba1185046
      at <unknown source file>:<unknown line>
   5: surrealdb_migrations::apply::main::{{closure}}::h7038d78af68e0cf6
      at <unknown source file>:<unknown line>
   6: surrealdb_migrations::sub_main::{{closure}}::h9b1648924da1fadc
      at <unknown source file>:<unknown line>
   7: tokio::runtime::park::CachedParkThread::block_on::ha860d55d6556a289
      at <unknown source file>:<unknown line>
   8: tokio::runtime::context::runtime::enter_runtime::h6c6849c9aeebf5a1
      at <unknown source file>:<unknown line>
   9: tokio::runtime::runtime::Runtime::block_on::h4175ba744beb23c4
      at <unknown source file>:<unknown line>
  10: surrealdb_migrations::main::h6bf4f838be9a985c
      at <unknown source file>:<unknown line>
  11: std::sys_common::backtrace::__rust_begin_short_backtrace::hdf592d09d6e5d2ca
      at <unknown source file>:<unknown line>
  12: std::rt::lang_start::{{closure}}::ha23b803fc43fefb4
      at <unknown source file>:<unknown line>
  13: std::rt::lang_start_internal::h103c42a9c4e95084
      at <unknown source file>:<unknown line>
  14: main<unknown>
      at <unknown source file>:<unknown line>
  15: __libc_start_main<unknown>
      at <unknown source file>:<unknown line>
  16: _start<unknown>
      at <unknown source file>:<unknown line>

lylejohnson commented 2 months ago

Looks like it's concatenating all of the schema files into one big blob?

// Concatenates the content of every schema file into one string.
fn extract_schema_definitions(schemas_files: Vec<SurqlFile>) -> String {
    concat_files_content(schemas_files)
}

So that's surely the problem. I'm sure it's faster and more efficient that way when the blob is small enough to fit in a single request, but can I request some way to serialize those requests, i.e. send the schema files separately instead of as one blob?
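Something like this is what I have in mind — just a rough sketch against the surrealdb Rust client, not the tool's actual code (the (name, content) pairs stand in for whatever SurqlFile holds):

use surrealdb::engine::any::Any;
use surrealdb::Surreal;

// Rough sketch: send one request per schema file instead of one concatenated
// payload, so no single HTTP body hits the server's size limit.
// The (name, content) pairs and this helper are placeholders, not the
// actual types used by surrealdb-migrations.
async fn apply_schemas_sequentially(
    db: &Surreal<Any>,
    schemas: Vec<(String, String)>,
) -> Result<(), surrealdb::Error> {
    for (name, content) in schemas {
        // Each file becomes its own request; the trade-off is that a failure
        // here no longer rolls back files that were already applied.
        db.query(content.as_str())
            .await?
            .check()
            .map_err(|e| {
                eprintln!("failed to apply schema file {name}: {e}");
                e
            })?;
    }
    Ok(())
}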

jquesada2016 commented 3 weeks ago

I think the reason it's done this way is because SurrealDB does not currently support cross-request transactions. So, in order to apply a migration, it tries to do it in a single transaction, which doesn't work because of SurrealDB's query size limit.
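If that's the case, the payload grows with the number of schema files because everything is concatenated and wrapped in one transaction before being sent as a single request. Roughly this (the exact wrapper statements are my assumption, based on the apply_in_transaction frame in the backtrace, not the crate's actual code):

// Rough idea of why the single request grows with the number of schema files:
// all definitions are concatenated and wrapped in one transaction, then sent
// to the /sql endpoint in a single request.
fn wrap_in_transaction(definitions: &str) -> String {
    format!("BEGIN TRANSACTION;\n{definitions}\nCOMMIT TRANSACTION;")
}

With ~1800 schema files of a few KB each, that single body can easily reach several megabytes, which would explain the 413 even though each individual file is tiny.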

lylejohnson commented 3 weeks ago

> I think the reason it's done this way is because SurrealDB does not currently support cross-request transactions. So, in order to apply a migration, it tries to do it in a single transaction, which doesn't work because of SurrealDB's query size limit.

@jquesada2016 But this is happening when it's loading the schema files (i.e. a bunch of DEFINE TABLE and DEFINE FIELD statements). Do we really need to worry about transactions for these kinds of operations? (I know that in my case, I don't use any transactions in those schema definitions).

jquesada2016 commented 3 weeks ago

I think it's because, if you don't do it in a single transaction and you have an error in one of your schema files (or SurrealDB throws an error), then you would end up in an inconsistent state. Transactions offer a simple mechanism for rolling back in case something goes awry.
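One possible middle ground (just a sketch of the idea, not something the tool offers today) would be to batch the schema files so that each request stays under a size cap, and wrap each batch in its own transaction. You keep most of the rollback behaviour while avoiding the 413:

// Sketch of a possible compromise: group schema file contents into batches
// that stay under a size cap, so each batch can be wrapped in its own
// BEGIN/COMMIT and sent as a separate request. MAX_BATCH_BYTES is an
// illustrative number, not a SurrealDB constant.
const MAX_BATCH_BYTES: usize = 512 * 1024;

fn batch_definitions(files: Vec<String>) -> Vec<String> {
    let mut batches: Vec<String> = Vec::new();
    let mut current = String::new();
    for content in files {
        // Start a new batch when adding this file would exceed the cap.
        if !current.is_empty() && current.len() + content.len() > MAX_BATCH_BYTES {
            batches.push(std::mem::take(&mut current));
        }
        current.push_str(&content);
        current.push('\n');
    }
    if !current.is_empty() {
        batches.push(current);
    }
    // A failure in batch N can still roll back that batch, but earlier batches
    // have already committed, so it's weaker than one global transaction.
    batches
}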