r4fek / django-cassandra-engine

Django Cassandra Engine - the Cassandra backend for Django
BSD 2-Clause "Simplified" License

Batch Save using Dataframe #124

Open oneandonlyonebutyou opened 5 years ago

oneandonlyonebutyou commented 5 years ago

Suppose I receive thousands of items as JSON and want to validate their format and save them all.

I did the following:

        df = pd.DataFrame(request.data)
        df["device_uuid"] = device_uuid
        df["serializer"] = None
        df["serializer"] = df.apply(
            lambda row: RawLogSerializer(data=row.to_dict()), axis=1
        )
        df.apply(
            lambda row: row["serializer"].is_valid(raise_exception=True), axis=1
        )
        logs = df.apply(lambda row: row["serializer"].save(), axis=1).tolist()

Then I save them one by one, which I know is not optimal:

        [l.save() for l in logs]

How should I do this correctly?

I asked this question before, and the response was:

        with BatchQuery() as b:
            YourModel.batch(b).create(...)

I do not know what that would look like in my example.

Thanks again
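
For reference, here is a minimal sketch of how the `BatchQuery` suggestion might map onto the example above. It is not a confirmed answer from the maintainers. The helper names `chunked` and `save_in_batches`, the `chunk_size` default, and the model name `RawLog` are all hypothetical. It assumes the model is a cqlengine-style model exposing `Model.batch(b).create(**fields)`, and it takes the batch class as a parameter so the helper can be tried without a Cassandra connection. The rows are split into smaller batches because Cassandra tends to warn about or reject very large batches.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from `items`."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def save_in_batches(model, rows, batch_factory, chunk_size=100):
    """Create one model row per dict in `rows`, grouping writes into batches.

    `model` is assumed to offer `model.batch(b).create(**fields)`;
    `batch_factory` is assumed to return a context manager that sends
    the queued writes when the `with` block exits (as BatchQuery does).
    Returns the number of rows queued.
    """
    created = 0
    for chunk in chunked(rows, chunk_size):
        with batch_factory() as b:
            for row in chunk:
                model.batch(b).create(**row)
                created += 1
    return created


# Hypothetical usage inside the view (RawLog is an assumed model name):
#
# from cassandra.cqlengine.query import BatchQuery
#
# items = [dict(item, device_uuid=device_uuid) for item in request.data]
# serializer = RawLogSerializer(data=items, many=True)
# serializer.is_valid(raise_exception=True)
# save_in_batches(RawLog, serializer.validated_data, BatchQuery)
```

With this shape, validation still goes through the serializer, but the writes use `validated_data` directly rather than calling `save()` per row. This also avoids the DataFrame step, which is only being used to loop over rows. If `RawLogSerializer` is a `ModelSerializer`, its `save()` would probably already write each row on its own, so the later `l.save()` calls would write everything a second time.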