realm / realm-dart

Realm is a mobile database: a replacement for SQLite & ORMs.
Apache License 2.0
767 stars 87 forks source link

Realm Query export to CSV within the Flutter UI (Windows/macOS) #1076

Open dotjon0 opened 1 year ago

dotjon0 commented 1 year ago

Description

To allow our users to export Realm query data (CSV format) and save to disk (macOS/Windows) within the Flutter UI.

Of course we could:

  1. Execute Realm query for the model
  2. Convert RealmResults into json (dart or ffi)
  3. Convert json to csv (dart or ffi)
  4. etc...

Before doing so, are there any tricks within Realm Dart that will make this easier? Some of the collections will have 200K-300K documents, so memory performance is a key requirement.

How important is this improvement for you?

I'd like to see it, but have a workaround

nielsenko commented 1 year ago

If you are using flexible sync, and hence MongoDB on the backend, I would recommend doing the export from MongoDB and use what ever tool to convert to csv.

If you need to do it from within the app, I would recommend skipping the to json step, and export straight to csv. Just iterating over the result set, writing one line to the csv file per object should be efficient memory wise, as only the last read object needs to be in memory at any given time.

The annoying part is that you need to repeat some trivial code per exported class. Hopefully we can improve on this later.

dotjon0 commented 1 year ago

Yes this is for flexible sync. Its within the UI - thanks for the suggestion of skipping the toJson step, will get the team to give this a whirl. I forgot about how memory efficient Realm Dart is with iterating, thanks for the reminder on this also! Thanks @nielsenko very much! Will leave this ticket to you to decide if you want to keep open or not.

nielsenko commented 1 year ago

Since you do have the data in a MongoDB database, you could consider making an Atlas function that packages and ships the csv, and then call that.

If you still prefer to do it locally, then something along the lines of:

import 'dart:io';

import 'package:realm_dart/realm.dart';

part 'csv_export.g.dart';

@RealmModel()
class _Person {
  @PrimaryKey()
  @MapTo('_id')
  late ObjectId id;
  late String firstName;
  late String lastName;
  late int age;
}

final realm = Realm(Configuration.local([Person.schema]));

void main(List<String> arguments) {
  realm.write(() {
    realm.addAll([
      Person(ObjectId(), 'Lionel', 'Messi', 35),
      Person(ObjectId(), 'Kylian', 'Mbappé', 24),
    ]);
  });

  final output = File('persons.csv').openWrite();
  for (final p in realm.all<Person>()) {
    output.writeln('${p.id};${p.firstName};${p.lastName};${p.age}'); // <-- easy enough but repetitive
  }

  Realm.shutdown();
}

Should work fine, even with millions of records.

dotjon0 commented 1 year ago

Thanks so much @nielsenko, you do continually exceed expectations and are sooo helpful! Its really appreciated!

Is there a way to expose the @MapTo('_id') property key in the output.writeln('${p.id};${p.firstName};${p.lastName};${p.age}'); ? We also need to make an 'import' method (beyond scope of this issue), so keeping the @MapTo('_id') would help a ton. i.e. without having to hard code the @MapTo('_id') for each property.

nielsenko commented 1 year ago

I do export the _id property with ${p.id}. @MapTo just renames it to something that is not private on the dart side, but I guess I might not fully comprehend your question.

With regards to bulk import, I recommend medium sized transaction, fx 1K objects per transaction, and consider moving the work to background isolate.

dotjon0 commented 1 year ago

So what I mean is each CSV column has a header i.e.

final output = File('persons.csv').openWrite(); output.writeln('"_id","first_name","last_name","age"'); // header row to define heading keys for (final p in realm.all()) { output.writeln('${p.id};${p.firstName};${p.lastName};${p.age}'); // <-- easy enough but repetitive }

so is there a way in the 'header row' to instead of:

"_id",

we can do (so this is always in line with RealmObject to reduce maintenance/bugs)

p.id.getKey() // which although the realm key is 'id', it gets the @MapTo key '_id' that is in the RealmObject

Thanks for the tip around importing.

nielsenko commented 1 year ago

Sorry, not yet. I do have a larger refactoring PR that addresses this, but it will not land in its current form.

Anyway for sake of arguments, would an interface like:

  final output = File('persons.csv').openWrite();
  var first = true;
  for (final person in realm.all<Person>()) {
    if (first) {
      output.writeln(person.properties.map((p) => p.name).join(';'));
      first = false;
    }
    output.writeln(person.properties.map((p) => p.value).join(';'));
  }

be useful?

dotjon0 commented 1 year ago

yes this would be very useful (ideally also the @mapto via .mapToName or similar). Although this would output the RealmObject properties in a random order and not in the order as specified in the RealmObject? If only XYZ properties are required in the CSV from the RealmObject, this would of course need more tooling - but the suggested p.firstName.name and p.firstName.mapToName in https://github.com/realm/realm-dart/issues/1077 would help in this instance...?

dotjon0 commented 1 year ago

@nielsenko please also see an indirectly related requirement which touches on the same sort of area https://github.com/realm/realm-dart/issues/1077

nielsenko commented 1 year ago

@dotjon0 The properties would be ordered the same, as specified in the realm object.

See also #1078