Let's say we have two CSV files and two mappings: one for companies (company_num, name, address) and one for directors (director_id, name, company_num), a standard situation for a MySQL export from different tables.
When generating entities from them, we can't link a director to a company, simply because we can't get the company entity_id from the directors mapping (and vice versa).
The only way for now is to create two entities: one 'full' from the companies file (having all company fields: name, address, etc.) and one 'stub' from the directors file (containing only company_num, since it's the only one available there), and then merge them outside Beast.
My proposal is to create a resolver that generates and returns an entity_id without generating an entity.
E.g., for the companies file we will have:
# just a simple Company entity
entities:
  company:
    schema: Company
    keys:
      - entity.registrationNumber
      # let's assume we get the key '1234abcd' from this input
    properties:
      registrationNumber:
        column: company_num
      name:
        column: name
      address:
        column: address
And for directors we will have:
entities:
  person:
    schema: Person
    keys:
      - entity.name
    properties:
      name:
        column: name
      # ... and other person's fields we have
  directorship:
    schema: Directorship
    keys:
      - entity.organization
      - entity.director
    properties:
      director: # we can link the director with the entity resolver
        entity: person
      organization: # but we can't link an organization, because its entity is created in another file/mapping
        our_custom_entity_id_resolver: # instead we will use our resolver
          - column.company_num # and give it the list of columns used to generate the entity_id
          # this will create an entity_id without creating an entity;
          # having the same column/value as in the companies file will yield the same id '1234abcd',
          # thus allowing us to link to the entity from the other file
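
To make the idea concrete, here is a minimal Python sketch of what such a resolver could do. It assumes the entity id is a deterministic hash of the key values only; `make_entity_id` is a hypothetical helper, and the SHA-1 scheme and sample rows are illustrative assumptions, not how Beast actually builds its keys.

```python
import hashlib


def make_entity_id(*key_values: str) -> str:
    """Hypothetical resolver: derive an id from key values without building an entity."""
    digest = hashlib.sha1()
    for value in key_values:
        digest.update(value.strip().encode("utf-8"))
        digest.update(b"\x1f")  # separator so ("ab", "c") differs from ("a", "bc")
    return digest.hexdigest()


# Illustrative rows, one from each CSV file
company_row = {"company_num": "00123", "name": "Acme Ltd", "address": "1 Main St"}
director_row = {"director_id": "7", "name": "Jane Doe", "company_num": "00123"}

# The companies mapping keys the Company entity on company_num ...
company_id = make_entity_id(company_row["company_num"])
# ... and the directors mapping computes the same id from its own column
organization_ref = make_entity_id(director_row["company_num"])

print(company_id == organization_ref)
```

The link only works if both mappings feed the same values, in the same order and with the same normalization, into the hash.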
I'm not sure whether this is a good idea, need your thoughts @dchaplinsky
1. Generate an entity fragment of type Company from the directors file, which will only have an entity id. It'll generate a single statement, with the key hashed the same way as for the company from the companies table, which you can then reference from the person. So for the same company you'll have two entities: one very shallow (key only) and one full. Their keys will be the same as long as you follow the same pattern for the surrogate key.
2. Option 1, but with a flag on that fragment entity, like virtual=True, so you can use that entity to reference the company but it won't be exported.
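
A rough Python sketch of how the `virtual` flag variant might behave: the stub Company exists only so that its id can be referenced, and it is skipped at export time. The `Entity` class and `export` function are hypothetical names used for illustration, not existing Beast code.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str
    schema: str
    properties: dict = field(default_factory=dict)
    virtual: bool = False  # hypothetical flag: usable as a reference, never exported


def export(entities):
    """Return only the entities that should be written out."""
    return [e for e in entities if not e.virtual]


# Key-only Company fragment built from the directors file
stub = Entity(id="1234abcd", schema="Company", virtual=True)
# The Directorship references the stub's id like any other entity
directorship = Entity(
    id="d1",
    schema="Directorship",
    properties={"organization": stub.id, "director": "p1"},
)

exported = export([stub, directorship])
print([e.schema for e in exported])
```

The exported Directorship still points at '1234abcd', which matches the full Company produced from the companies file as long as both use the same key pattern.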