Closed Wcollard closed 4 years ago
The short answer is yes, though it requires a bit of custom code. The high-level approach is to use the flattened() utility to denormalize the structure into a flat table.
Flattening the structure before importing is probably not ideal, but that is how it works for now.
Here are some (untested) examples to get you started:
<!-- example.xml -->
<root>
  <parent>
    <parent_name>Parent 1</parent_name>
    <children>
      <child>
        <child_name>Child 1</child_name>
      </child>
      <child>
        <child_name>Child 2</child_name>
      </child>
    </children>
  </parent>
  <parent>
    <parent_name>Parent 2</parent_name>
    <children>
      <child>
        <child_name>Child 3</child_name>
      </child>
    </children>
  </parent>
</root>
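To make the flattening step concrete, here is a minimal standard-library sketch of the kind of flat table that denormalizing this document should produce: one row per child, with the parent's fields repeated on each row. It is not wq.io's implementation, and `flatten_parents` is a hypothetical helper name; the exact column names the wizard sees may differ.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example document above.
EXAMPLE_XML = """
<root>
  <parent>
    <parent_name>Parent 1</parent_name>
    <children>
      <child><child_name>Child 1</child_name></child>
      <child><child_name>Child 2</child_name></child>
    </children>
  </parent>
  <parent>
    <parent_name>Parent 2</parent_name>
    <children>
      <child><child_name>Child 3</child_name></child>
    </children>
  </parent>
</root>
"""


def flatten_parents(xml_text):
    """Return one flat dict per <child>, repeating the parent's fields."""
    rows = []
    for parent in ET.fromstring(xml_text).findall("parent"):
        # Scalar parent fields: every direct sub-element except <children>.
        parent_fields = {e.tag: e.text for e in parent if e.tag != "children"}
        for child in parent.findall("children/child"):
            row = dict(parent_fields)
            row.update({e.tag: e.text for e in child})
            rows.append(row)
    return rows


rows = flatten_parents(EXAMPLE_XML)
for row in rows:
    print(row)
```

Each row carries both `parent_name` and `child_name`, which is why the import ends up looping over children rather than parents.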
# myapp/wizard.py
import data_wizard
from data_wizard.loaders import FileLoader
from rest_framework.serializers import ModelSerializer
from wq.io import XmlFileIO, BaseIO, TupleMapper, flattened

from .models import Parent, Child, XMLFile


# IO classes & loader for source model
class NestedIO(TupleMapper, BaseIO):
    pass


class MyXMLIO(XmlFileIO):
    nested = True

    def parse_item(self, el):
        data = {}
        for e in el:
            if e.tag == 'children':
                # Wrap the nested <child> elements in their own IO
                val = NestedIO(data=[self.parse_item(c) for c in e])
            else:
                val = e.text
            data[e.tag] = val
        return data


class CustomLoader(FileLoader):
    default_serializer_class = 'myapp.wizard.ChildSerializer'

    def load_io(self):
        return flattened(MyXMLIO, filename=self.file.path, inner_attr='children')


data_wizard.set_loader(XMLFile, 'myapp.wizard.CustomLoader')


# Serializers for target models
class ParentSerializer(ModelSerializer):
    class Meta:
        model = Parent
        fields = '__all__'


class ChildSerializer(ModelSerializer):
    parent = ParentSerializer()

    def create(self, validated_data):
        # Create the nested parent first, then attach it to the child
        parent_data = validated_data.pop('parent')
        parent = ParentSerializer().create(parent_data)
        validated_data['parent'] = parent
        return super().create(validated_data)

    class Meta:
        model = Child
        fields = '__all__'


data_wizard.register("Child with nested Parent", ChildSerializer)
Note that the nesting is inverted on the serializer side because the wizard is looping over each Child record together with the nested parent. This means that if you have multiple children for a single parent, you will need to customize create() on the ParentSerializer to make sure the parent is only created the first time. (Or you can try using NaturalKeySerializer, which is designed to address this use case.)
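The "only create the parent the first time" logic can be sketched in plain Python, with dictionaries and lists standing in for the Parent and Child tables. This is an illustration only: `import_rows` and the `name` field are hypothetical, since the models are not shown here. In a real create() override, Django's `Parent.objects.get_or_create(...)` would play the role of the dictionary lookup.

```python
def import_rows(rows):
    """Create each parent once, then attach every child row to it."""
    parents = {}   # parent_name -> parent record (stands in for the Parent table)
    children = []  # stands in for the Child table
    for row in rows:
        name = row["parent_name"]
        # Only create the parent the first time its name is seen.
        if name not in parents:
            parents[name] = {"name": name}
        children.append({"name": row["child_name"], "parent": parents[name]})
    return parents, children


# Flattened rows: one per child, with the parent's name repeated.
flat_rows = [
    {"parent_name": "Parent 1", "child_name": "Child 1"},
    {"parent_name": "Parent 1", "child_name": "Child 2"},
    {"parent_name": "Parent 2", "child_name": "Child 3"},
]

parents, children = import_rows(flat_rows)
print(len(parents), "parents,", len(children), "children")
```

Without the membership check, "Parent 1" would be created twice, once for each of its children.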
Closing due to inactivity.
Does Django Data Wizard work with an XML file that has related tables? That is, a schema that, when properly imported, has at least two related tables with a one-to-one or, more likely, a one-to-many relationship. Is there a way to properly import all of the XML data?