Closed fprasx closed 5 months ago
Nice catch! If that code is reachable, then yeah let's make it return an error rather than panicking. If there's no suitable error variant already (one whose error message matches what what went wrong here), then feel free to create a new one.
Sounds good #frictionlog
EDIT: changed to enum variant
Looks good! Could you add a test for it as well?
Just drop a schema file (.graphql
extension) that has this issue (and is otherwise valid) in test_data/tests/schema_errors
, and the test automation should pick it up. You'll then need to add a corresponding .schema-error.ron
file that contains the error, or you can create it based on what the test automation suggests is needed (after verifying it for correctness).
Running into a little issue, which is that if a type of a field on the root query type is unknown, the error is emitted twice, once in check_root_query_type_invariants
, and once in check_type_and_property_and_edge_invariants
.
(yes testing reviewed a new place this error can happen 😅)
Do you think we should deduplicate errors? Or maybe just leave it when checking the root type, and only emit it when checking all edges + properties?
I'll push what I have so you can see.
Okie, added the tests. Stumbled upon the issue of errors being a different order so I also added a method to DisplayVec
. I went for the quadratic approach because a) there probably won't be a prohibitively slow number of errors and b) imposing Hash/Ord
on errors seems unreasonable (and might be impossible due to wrapping async_graphql_parser
errors). For this reason, we also can't use a hashset or btreeset.
Stumbled upon the issue of errors being a different order
Do they appear in a non-deterministic order? If so, that's a bug that we have to fix — we really, really want Trustfall to be a deterministic system because otherwise it'll become impossible to debug. For example, this is also why we always use BTreeMap / BTreeSet
instead of their hashed equivalents.
I'm not a fan of the new method because it allows non-determinism to sneak in. Can you try removing the set equality method and then see if it's enough to flip the lines in the expected test output file, or if there's a deeper determinism issue?
Thanks for powering through this!
Yup, it's because of a HashMap nondeterminism (trustfall_core/src/schema/mod.rs:331
):
fn check_type_and_property_and_edge_invariants(
query_type_definition: &TypeDefinition,
vertex_types: &HashMap<Arc<str>, TypeDefinition>,
) -> Result<(), Vec<InvalidSchemaError>> {
let mut errors: Vec<InvalidSchemaError> = vec![];
for (type_name, type_defn) in vertex_types {
All of the type-checking methods in this file take HashMap
s, and here's the definition of the Schema
struct:
#[derive(Debug, Clone)]
pub struct Schema {
pub(crate) schema: SchemaDefinition,
pub(crate) query_type: ObjectType,
pub(crate) directives: HashMap<Arc<str>, DirectiveDefinition>,
pub(crate) scalars: HashMap<Arc<str>, TypeDefinition>,
pub(crate) vertex_types: HashMap<Arc<str>, TypeDefinition>,
pub(crate) fields: HashMap<(Arc<str>, Arc<str>), FieldDefinition>,
pub(crate) field_origins: BTreeMap<(Arc<str>, Arc<str>), FieldOrigin>,
}
I can do a larger refactor converting these to their B-tree equivalents.
Let's approach it a different way: we can fix the nondeterminism by accessing the elements in a predictable order, instead of iterating over the hashmap.
For the schema type, we care a bit less about schema parsing/validation performance since that happens only once in the app's lifecycle, and we care a lot about query type-checking performance because that happens once per query. HashMap is faster than BTreeMap past a certain minimum size (and large schemas are a thing), so it's better to keep the fields on Schema
as hashmaps and do a sort on the vertex type names as part of validating schemas.
Sorry this is a bit complicated!
All good. Interesting to think through.
A little sad though that I can't just %s/HashMap/BTreeMap/g
. . . because that basically just worked 🤣
Rust is like that! Things tend to just work, and it will spoil you rotten! It's hard to switch back to another language afterward 😅
Very true. No good deed goes unpunished haha. I'm going to make a new PR for the deterministic iteration order, and then I'll rebase this one on to of that if that's ok. I think it'll be cleaner.
Sounds good to me!
Kind of saw this coming so I was able to just revert the commit adding the set stuff and rebase :)
Looks great!
Should this actually be a new error variant?