Open ehofesmann opened 4 years ago
I was able to create an index on 5000 samples in 0.1 seconds with this:
import fiftyone as fo
import fiftyone.core.odm as foo
# `dataset` is the already-loaded fo.Dataset that fails to sort
conn = foo.get_db_conn()
collection = conn[dataset._doc.sample_collection_name]
# Index the field used as the sort key
collection.create_index("yolov4.ground_truth_eval.true_positives.0_75")
This is probably the way to go to fix this issue.
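If the index approach above were wrapped in a reusable helper, it might look roughly like the sketch below. The function name `ensure_sort_index` is hypothetical and not part of FiftyOne; `collection` is assumed to be a PyMongo collection such as the one obtained above, and only the PyMongo methods `index_information()` and `create_index()` are used. The helper skips creation when an existing index already leads with the sort field, since such an index can already serve the sort.

```python
def ensure_sort_index(collection, field_path):
    """Return the name of an index that can serve a sort on `field_path`.

    Hypothetical helper (not a FiftyOne API). `collection` is assumed to
    behave like a pymongo Collection.
    """
    # index_information() maps index name -> info; info["key"] is a list
    # of (field, direction) pairs in index order.
    for name, info in collection.index_information().items():
        keys = info["key"]
        if keys and keys[0][0] == field_path:
            # An index whose first key is the sort field already exists.
            return name

    # Otherwise create a single-field ascending index; PyMongo returns
    # the name of the new index.
    return collection.create_index(field_path)


# Usage, assuming `collection` was obtained as in the comment above:
# ensure_sort_index(collection, "yolov4.ground_truth_eval.true_positives.0_75")
```

Calling the helper twice for the same field should create the index only once, which makes it safe to run before every sort.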
Yes, yes, and yes to indexes!!!
Describe the problem
When trying to sort a large dataset (>100MB), MongoDB throws an error:
OperationFailure: Sort exceeded memory limit of 104857600 bytes, but did not opt in to external sorting
This came up when running the evaluate detections tutorial on 5000 samples in coco-2017.
One way around this is to set allowDiskUse: True for the aggregation that contains the sort stage, but this will likely cause the sorting to take a long time. https://docs.mongodb.com/manual/reference/operator/aggregation/sort/#sort-and-memory-restrictions
Another, possibly faster, way could be to first create a MongoDB index on whatever field you are sorting by and then sort by that field, which avoids the memory requirements. In the evaluate detections tutorial, the error arose when sorting by the newly created field tp_iou_0_75, but I could still sort by filepath, most likely because an index exists for it.
Code to reproduce issue
https://voxel51.com/docs/fiftyone/tutorials/evaluate_detections.html
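For comparison, the allowDiskUse workaround described in this issue could be sketched at the PyMongo level as follows. This is not how FiftyOne builds its pipelines internally; `sorted_documents` is a hypothetical helper, and `collection` is assumed to be the PyMongo collection backing the dataset's samples. The only library call used is `Collection.aggregate`, which accepts `allowDiskUse` as a keyword option.

```python
def sorted_documents(collection, field_path, descending=False, allow_disk_use=True):
    """Sort a collection by `field_path` via an aggregation pipeline.

    Hypothetical helper (not a FiftyOne API). With allow_disk_use=True,
    MongoDB may spill the sort to temporary files instead of failing
    when the in-memory sort limit is exceeded.
    """
    direction = -1 if descending else 1
    pipeline = [{"$sort": {field_path: direction}}]
    # allowDiskUse is an option of the aggregate command as a whole,
    # not a field inside the $sort stage.
    return collection.aggregate(pipeline, allowDiskUse=allow_disk_use)


# Usage, assuming `collection` backs the dataset's samples:
# for doc in sorted_documents(collection, "tp_iou_0_75", descending=True):
#     print(doc["filepath"])
```

Spilling to disk avoids the error but is likely slower than an indexed sort, so this is probably best treated as a fallback for fields that have no index.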
What areas of FiftyOne does this bug affect?
App: FiftyOne application issue
Core: Core fiftyone Python library issue
Server: FiftyOne server issue