mswjs / data

Data modeling and relation library for testing JavaScript applications.
https://npm.im/@mswjs/data
MIT License

Support defining models with factory functions #95

Open smashercosmo opened 3 years ago

smashercosmo commented 3 years ago

Use case: sometimes a field value needs to be generated based on another field value. This would be possible if model definitions supported factory functions.

Example:

import { datatype } from 'faker'
import { factory, primaryKey } from '@mswjs/data'

const db = factory({
  model() {
    const fieldValue = `fieldValue-${datatype.uuid()}`
    const anotherFieldValue = `anotherFieldValue-${fieldValue}`
    return {
      id: primaryKey(datatype.uuid),
      field: () => fieldValue,
      anotherField: () => anotherFieldValue,
    }
  },
})

export { db }

kettanaito commented 3 years ago

Hey, @smashercosmo. Thanks for the suggestion.

I think it's a viable use case, but it requires a proper API on the library's side to allow it.

By design, all keys in the object that you pass to the factory function are model names. We shouldn't change that by introducing special keys like model that can act as factory functions.

One way I can see this done in the current API is to expose the entity to the value getters of a model's property:

factory({
  user: {
    age: () => 15,
    isAdult: (entity) => entity.age >= 18
  }
})

However, there is a semantic dissonance here: you're accessing actual values while defining your model. There may also be relational properties that reference extraneous entities and may not be known at the entity's creation time. I'd be curious to see how other ORM tools approach this use case. Do you happen to know some examples of how other tools do this?

smashercosmo commented 3 years ago

By design, all keys in the object that you pass to the factory function are model names. We shouldn't change that by introducing special keys like model that can act as factory functions.

In my example, model is not a special key. It's just an entity/model name. Sorry for the confusion.

smashercosmo commented 3 years ago

I wonder whether this can be achieved with a property getter, like:

const db = factory({
  get person() {
    const personFieldValue = `personFieldValue-${datatype.uuid()}`
    const anotherPersonFieldValue = `anotherPersonFieldValue-${personFieldValue}`
    return {
      id: primaryKey(datatype.uuid),
      personField: () => personFieldValue,
      anotherPersonField: () => anotherPersonFieldValue,
    }
  },
})

kettanaito commented 3 years ago

In my example, model is not a special key. It's just an entity/model name. Sorry for the confusion.

My bad, I got it wrong. Having a factory function instead of a plain object may not be that bad, but I'd love to experiment with the possible API for this.

What would be the implications if you defined the data outside of the factory?

const personFieldValue = `personFieldValue-${datatype.uuid()}`
const anotherPersonFieldValue = `anotherPersonFieldValue-${personFieldValue}`

const db = factory({
  person: {
    personField: () => personFieldValue,
    anotherPersonField: () => anotherPersonFieldValue,
  }
})

In the example above you don't really rely on the entity values, you create personFieldValue and then derive anotherPersonFieldValue based on it, which can be done outside of the factory() call.

Also, you can create a custom function that encapsulates that person logic and call it, returning an object:

function createPerson() {
  const personFieldValue = `personFieldValue-${datatype.uuid()}`
  const anotherPersonFieldValue = `anotherPersonFieldValue-${personFieldValue}`

  return {
    id: primaryKey(datatype.uuid),
    personField: () => personFieldValue,
    anotherPersonField: () => anotherPersonFieldValue,
  }
}

const db = factory({
  person: createPerson()
})

smashercosmo commented 3 years ago

Hmm... I may be missing something, but this won't work. Every time we call db.person.create() we will get the same values for personField and anotherPersonField, because they are captured in the closure when createPerson() runs.
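
The closure problem can be reproduced without the library. The sketch below is only an illustration: createEntity is a hypothetical stand-in for db.person.create() that calls each value getter once, and nextId replaces datatype.uuid. It is not how @mswjs/data is implemented.

```javascript
// Hypothetical stand-in for db.person.create(): calls every value getter once.
function createEntity(model) {
  const entity = {}
  for (const [key, getter] of Object.entries(model)) {
    entity[key] = getter()
  }
  return entity
}

// Hypothetical replacement for datatype.uuid, so the sketch has no dependencies.
let counter = 0
const nextId = () => `id-${++counter}`

function createPerson() {
  // These two values are computed once, when createPerson() is called.
  const personFieldValue = `personFieldValue-${nextId()}`
  const anotherPersonFieldValue = `anotherPersonFieldValue-${personFieldValue}`

  return {
    id: nextId,
    personField: () => personFieldValue,
    anotherPersonField: () => anotherPersonFieldValue,
  }
}

const personModel = createPerson()
const first = createEntity(personModel)
const second = createEntity(personModel)

console.log(first.id === second.id) // false: the id getter runs per entity
console.log(first.personField === second.personField) // true: value captured in the closure
```

Only getters that do their work when called (like id here) produce a fresh value per entity.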

kettanaito commented 3 years ago

You're right, sorry for overlooking that fact.

davidtkramer commented 3 years ago

I like the approach mentioned here: https://github.com/mswjs/data/issues/95#issuecomment-842272584. It's similar to what factory_bot does with dependent attributes, although they have the advantage of Ruby DSL magic.

I've been working on a lib similar to this (just discovered this week that mswjs/data exists 😄 ) and used a similar approach. The attribute function is passed a Proxy to the entity that can fetch other values. I haven't tackled associations though so not sure how to solve issues with that.

kettanaito commented 3 years ago

Hey, @davidtkramer. Excited to hear you're building a similar tool! We could certainly use your expertise, so feel free to share any areas we can improve in the discussions 🙏 We can join our efforts and produce a superb tool for everybody in the ecosystem to use.

For instance, I wonder how you solved potentially undefined entity values?

user: {
  a: (entity) => entity.b,
  b: () => 'foo'
}

The model's properties will be iterated in the order of declaration (default object iteration order). This means that when the a value getter is evaluated, the b (entity.b) property won't be set yet.
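
A minimal sketch of that ordering issue, assuming a hypothetical evaluator (buildEntity) that walks the model in declaration order and hands each getter the partially built entity. This is not the library's code, only an illustration of the concern:

```javascript
// Hypothetical evaluator: passes the entity built so far to each value getter.
function buildEntity(model) {
  const entity = {}
  // Object.keys() returns string keys in insertion (declaration) order.
  for (const key of Object.keys(model)) {
    entity[key] = model[key](entity)
  }
  return entity
}

const user = buildEntity({
  a: (entity) => entity.b, // "b" has not been evaluated yet
  b: () => 'foo',
})

console.log(user.a) // undefined
console.log(user.b) // "foo"
```

Swapping the declaration order of a and b would make a resolve to "foo", which is why the result depends on how the model happens to be written.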

kettanaito commented 3 years ago

I also wonder if you imply value synchronization when connecting two model properties this way?

In the example above, what's your expected result when you update the value of the b property? What should user.a equal? I'd like to avoid having to issue any kind of value subscriptions, as that would overcomplicate the internal model handling.

wrex commented 3 years ago

Does two-pass evaluation make sense for the factory? One pass to generate an entity where the properties of the passed argument are all empty strings, then a second pass that passes that entity as an argument?

In other words:

user: {
  a: (entity) => 'baz' + entity.b,
  b: () => 'foo',
  c: (entity) => 'bar' + entity.a
}

Would generate

user: {
  a: 'baz',
  b: 'foo',
  c: 'barbaz'
}

It might make more sense to make all properties in the first pass null rather than empty strings to handle other data types, but empty string semantics work best for my use case.
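
One way to read this proposal is sketched below. createEntityTwoPass is a hypothetical helper, not an existing API: the first pass fills every property with an empty string, and the second pass evaluates the getters in declaration order and writes each result back to the same entity.

```javascript
// Hypothetical two-pass evaluator, following the description above.
function createEntityTwoPass(model) {
  const keys = Object.keys(model)

  // First pass: placeholder entity where every property is an empty string.
  const entity = Object.fromEntries(keys.map((key) => [key, '']))

  // Second pass: evaluate getters in declaration order, writing results back
  // so that later getters can read earlier results.
  for (const key of keys) {
    entity[key] = model[key](entity)
  }
  return entity
}

const user = createEntityTwoPass({
  a: (entity) => 'baz' + entity.b,
  b: () => 'foo',
  c: (entity) => 'bar' + entity.a,
})

console.log(user) // { a: 'baz', b: 'foo', c: 'barbaz' }
```

Under this reading a still sees the placeholder for b, so the result remains sensitive to declaration order; the placeholders only prevent undefined from leaking into the values.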

mauriceoc commented 2 years ago

Does two-pass evaluation make sense for the factory? One pass to generate an entity where the properties of the passed argument are all empty strings, then a second pass that passes that entity as an argument?

In other words:


user: {
  a: (entity) => 'baz' + entity.b,
  b: () => 'foo',
  c: (entity) => 'bar' + entity.a
}

It might be an idea to give access to the entire modelDefinitions object here, instead of just the current entity. This would allow values to be generated from any entity in the definitions, instead of just the current one.

user: {
  a: (definitions) => 'baz' + definitions.user.b,
  b: () => 'foo',
  c: (definitions) => 'bar' + definitions.bar.a
}

I guess it also raises the question of whether we would want to generate values from those generated values.

Spitballed API:

{
  user: {
    firstName: faker.firstName,
    lastName: faker.lastName,
    email: {
      dependsOn: ['user.firstName', 'user.lastName'],
      value: ({ user }) => faker.exampleEmail(user.firstName, user.lastName),
    },
  },
  bar: {
    hey: () => 'zap',
  },
  foo: {
    something: {
      // can depend on generated email and also another definition outside of this one
      dependsOn: ['user.email', 'bar.hey'],
      value: ({ user, bar }) => faker.doStuff(user.email, bar.hey),
    },
  },
  break: {
    world: {
      // depends on itself
      dependsOn: ['break.world'],
      value: () => 'oh dear',
    },
  },
}

Not sure about the dot syntax in the dependsOn but it seems sorta intuitive.
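
For what it's worth, a dependsOn list like this could be ordered with a depth-first topological sort that rejects cycles such as break.world. The sketch below assumes the spitballed shape above; resolveOrder is a hypothetical helper and nothing here exists in @mswjs/data.

```javascript
// Hypothetical resolver for the "dependsOn" idea; not part of @mswjs/data.
// Returns "model.property" paths in an order where dependencies come first.
function resolveOrder(definitions) {
  const order = []
  const state = new Map() // path -> 'visiting' | 'done'

  function visit(path, trail) {
    if (state.get(path) === 'done') return
    if (state.get(path) === 'visiting') {
      throw new Error(`Circular dependency: ${[...trail, path].join(' -> ')}`)
    }
    const [modelName, propName] = path.split('.')
    const prop = definitions[modelName]?.[propName]
    if (prop === undefined) throw new Error(`Unknown property: ${path}`)

    state.set(path, 'visiting')
    // Plain getter functions declare no dependencies.
    const deps = typeof prop === 'function' ? [] : prop.dependsOn
    for (const dep of deps) visit(dep, [...trail, path])
    state.set(path, 'done')
    order.push(path)
  }

  for (const [modelName, model] of Object.entries(definitions)) {
    for (const propName of Object.keys(model)) {
      visit(`${modelName}.${propName}`, [])
    }
  }
  return order
}

const order = resolveOrder({
  user: {
    email: {
      dependsOn: ['user.firstName', 'user.lastName'],
      value: ({ user }) => `${user.firstName}.${user.lastName}@example.com`,
    },
    firstName: () => 'Jane',
    lastName: () => 'Doe',
  },
})
console.log(order) // [ 'user.firstName', 'user.lastName', 'user.email' ]

let cycleError
try {
  resolveOrder({
    break: { world: { dependsOn: ['break.world'], value: () => 'oh dear' } },
  })
} catch (error) {
  cycleError = error
}
console.log(cycleError.message) // Circular dependency: break.world -> break.world
```

The same check also catches two properties that depend on each other, since the second visit finds the first one still marked as 'visiting'.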

kettanaito commented 2 years ago

@wrex, I'm afraid that having to evaluate models twice may introduce performance issues for people with a large number of models.

I can't say I like the idea of mixing models with entities; I think it disrupts the model -> entity flow. I would like to support this feature, but without complicating the model syntax. Once a model (property) can access the entity reference, circular references become possible and, suddenly, we're introducing API like dependsOn. But even the dependency array won't rule out impossible references, as multiple properties may cross-depend on each other.

It's possible to update properties using values of the entire entity via the .update() and .updateMany() methods:

const db = factory({
  user: { id: primaryKey(String), anotherProperty: String }
})

const user = db.user.create({ id: 'uuid' })

// Update all users (missing "where" property)
db.user.updateMany({
  data: {
    // Evolve the "anotherProperty" key
    anotherProperty(entity) {
      // to have the next value based on another property "id"
      return `${entity.id}-extra-value`
    }
  }
})

While this is verbose, I like the explicit intent here to derive one property from another.

Maybe property getters would work too.

import { factory, primaryKey, derive } from '@mswjs/data'

const db = factory({
  user: {
    id: primaryKey(String),
    // This is a getter property.
    anotherProperty: derive((user) => `${user.id}-extra-value`),
  }
})

const user = db.user.create({ id: 'foo' })
user.anotherProperty // "foo-extra-value"

On the condition that:

micha149 commented 2 years ago

@kettanaito what about just evaluating the primaryKey in advance and passing it into each value getter? It could then be used as a seed for randomization functions.

Example for an actual use case in my project:

const generateUserData = (id: string): RandomUser => {
    faker.seed(numberHash(id));

    const firstname = faker.name.firstName();
    const lastname = faker.name.lastName();

    return {
        firstname,
        lastname,
        email: faker.internet.email(firstname, lastname),
    };
}

const db = factory({
    user: {
        id: primaryKey(faker.datatype.uuid),
        firstname: ({ id }) => generateUserData(id).firstname,
        lastname: ({ id }) => generateUserData(id).lastname,
        email: ({ id }) => generateUserData(id).email,
        phone: () => faker.phone.phoneNumber('+49 (###) ######-##'),
    },
});

To reduce overhead, the generateUserData function calls could be memoized.
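
A minimal sketch of that memoization, assuming a simple Map-based cache keyed by id. The memoize helper and the stand-in generator below are illustrative only; the real generateUserData would keep its faker calls, and passing { id } to the getters is the proposed behaviour, not something the library does today.

```javascript
// Hypothetical single-argument memoizer backed by a Map.
function memoize(fn) {
  const cache = new Map()
  return (key) => {
    if (!cache.has(key)) cache.set(key, fn(key))
    return cache.get(key)
  }
}

let calls = 0

// Stand-in for the faker-based generator, so the sketch has no dependencies.
const generateUserData = memoize((id) => {
  calls++
  const firstname = `First-${id}`
  const lastname = `Last-${id}`
  return { firstname, lastname, email: `${firstname}.${lastname}@example.com` }
})

// Value getters shaped like the ones in the proposal above.
const userModel = {
  firstname: ({ id }) => generateUserData(id).firstname,
  lastname: ({ id }) => generateUserData(id).lastname,
  email: ({ id }) => generateUserData(id).email,
}

const seed = { id: 'abc' }
const user = {
  firstname: userModel.firstname(seed),
  lastname: userModel.lastname(seed),
  email: userModel.email(seed),
}

console.log(user.email) // "First-abc.Last-abc@example.com"
console.log(calls) // 1: the generator ran once for three getters
```

The cache here grows with every new id, so a long-lived test process might want to clear it between test runs.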