IntelLabs / vdms

VDMS: Your Favorite Visual Data Management System
MIT License

Images stored in a single directory? #89

Closed prashastk closed 3 months ago

prashastk commented 5 years ago

It looks like all images go into a single directory, db/images/png. That will cause filesystem performance issues as the number of images grows (more details in https://serverfault.com/questions/796665/what-are-the-performance-implications-for-millions-of-files-in-a-modern-file-sys).

There is also a theoretical limit on the number of files that a filesystem can contain. I believe ext4 by default allows a max of 2^32 files. I would expect someone to hit that eventually, though there may well be other performance issues with storing 4B+ images in one VDMS instance.

I would recommend:

  1. Create a sub-dir structure to store the files.
  2. Consider creating bigger binary blob files containing multiple images. Reference images by keeping track of starting index and length.
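
To make the two recommendations above concrete, here is a minimal Python sketch. It is not VDMS code: the function names (`sharded_path`, `append_to_blob`, `read_from_blob`), the SHA-256 hash prefix scheme, and the two-level, two-character directory layout are all assumptions chosen for illustration.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def sharded_path(root: Path, image_id: str, levels: int = 2, width: int = 2) -> Path:
    """Recommendation 1 (hypothetical layout): root/ab/cd/<image_id>.png.

    The sub-directory names come from the leading hex characters of a
    SHA-256 digest of the image ID, so files spread evenly across
    16**(levels * width) leaf directories.
    """
    digest = hashlib.sha256(image_id.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return root.joinpath(*parts, f"{image_id}.png")


def append_to_blob(blob_file: Path, payload: bytes) -> Tuple[int, int]:
    """Recommendation 2: append an image to a shared blob file.

    Returns (offset, length), which the caller would store as the
    image's reference instead of a file path.
    """
    blob_file.parent.mkdir(parents=True, exist_ok=True)
    with open(blob_file, "ab") as f:
        offset = f.seek(0, 2)  # move to end of file; returns current size
        f.write(payload)
    return offset, len(payload)


def read_from_blob(blob_file: Path, offset: int, length: int) -> bytes:
    """Read one image back from the blob file using its stored reference."""
    with open(blob_file, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

With the defaults, the first function spreads files over 65,536 leaf directories. The blob approach trades file count for extra bookkeeping: deleting or replacing an image leaves a hole that would need compaction, which this sketch does not handle.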

vishakha041 commented 5 years ago

We have talked about this in the past, but we did not have performance data from a large enough dataset that pointed to this shortcoming. Thank you for the link. We will keep this in mind so we know what to look for.

ifadams commented 3 months ago

@cwlacewe this was solved by sub-dividing the storage directory for local storage. That approach has introduced some other problems, but for the time being this is a solved issue, so closing.