tnc-ca-geo / animl-base

Application deployed on field computers to integrate Buckeye X80 wireless camera traps with Animl

Figure out image naming convention, hashing, and directory structure for S3 image repository #7

Closed nathanielrindlaub closed 4 years ago

nathanielrindlaub commented 4 years ago

I think @postfalk mentioned wanting to use a hash function on all incoming images to spread them out across multiple objects on S3. @postfalk, would you mind confirming that's what you were suggesting and letting me know if you have specific recommendations for hash functions and object/directory structure?

In terms of the naming conventions, remember that at this stage in the pipeline (image received on Raspberry Pi, image uploaded to S3), we haven't extracted EXIF data yet, so it might not make sense for us to rename the images at all when we store them in S3. We also have a limited set of values we could use as inputs to a hash function.
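
Since the file bytes are available before any EXIF extraction, one option is to hash the content itself. Below is a minimal sketch of that idea, not a settled design: the function name `build_object_key`, the choice of SHA-256, and the two-character prefix are all assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def build_object_key(image_bytes: bytes, original_name: str, prefix_len: int = 2) -> str:
    """Build an S3 object key from the image content alone (hypothetical helper).

    The key looks like '<first prefix_len hex chars>/<full hex digest><ext>',
    so objects spread across prefixes and identical files map to the same key.
    """
    # SHA-256 is an assumed choice; any stable content hash would work here.
    digest = hashlib.sha256(image_bytes).hexdigest()
    # Keep the original extension (lowercased) so the file type stays visible.
    ext = PurePosixPath(original_name).suffix.lower()
    return f"{digest[:prefix_len]}/{digest}{ext}"


if __name__ == "__main__":
    # Stand-in bytes; on the Pi these would be read from the received file.
    print(build_object_key(b"example image bytes", "IMG_0001.JPG"))
```

A side effect of hashing the content is that re-uploading the same image produces the same key, which may or may not be the behavior we want. The original file name would need to be kept elsewhere (for example as object metadata) if it matters.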

In terms of directory structure, eventually we will have images coming in from Buckeye cameras, cellular cameras, manual uploads off SD cards, etc., and I'm not sure we gain anything by imposing structure within the S3 bucket vs. treating images from every source the same. Perhaps we could partition the S3 bucket into "Projects" (e.g., SCI might be one project that a certain group of users has access to, Dye Creek another)? Although I suppose this all depends on where we land with the backend data model design.
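
To make the "Projects" idea concrete, here is a rough sketch of what a per-project key prefix could look like. Everything in it is hypothetical: the `projects/` root, the `project_key` helper, and the slug rules are placeholders pending the data model discussion.

```python
import re


def project_key(project: str, file_name: str) -> str:
    """Place an object under a per-project prefix (hypothetical layout)."""
    # Normalize the project name into a lowercase, hyphen-separated slug.
    slug = re.sub(r"[^a-z0-9]+", "-", project.lower()).strip("-")
    if not slug:
        raise ValueError("project name must contain at least one letter or digit")
    return f"projects/{slug}/{file_name}"


if __name__ == "__main__":
    print(project_key("SCI", "IMG_0001.JPG"))
    print(project_key("Dye Creek", "IMG_0002.JPG"))
```

With a layout like this, every image would share one structure regardless of which camera type it came from, and access could in principle be scoped per project by key prefix.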

I would love any ideas anyone has on the S3 image management piece of this.