pyinat / pyinaturalist

Python client for iNaturalist
https://pyinaturalist.readthedocs.io
MIT License

GUANO metadata support in audio files #543

Closed arky closed 6 months ago

arky commented 7 months ago

Requesting a helper function to extract GUANO metadata from audio files hosted on iNaturalist or stored on a local computer. This would help with uploading bat recordings and with analyzing recordings that are already hosted on iNaturalist.

The GUANO metadata format (https://www.wildlifeacoustics.com/SCHEMA/GUANO.html) is used to store metadata about audio recordings, similar to EXIF data in image files. GUANO metadata is written to WAV files by hardware bat recorders from various vendors.

WAV files with GUANO metadata are available from this project (https://www.inaturalist.org/projects/citibats-cambodia) for testing.

GUANO metadata parsing is provided by the guano Python module: http://pypi.org/project/guano

Use case

1 : Uploading audio recordings to iNaturalist

Workarounds

Is there an existing workaround to accomplish this?

JWCook commented 7 months ago

This is a really cool idea! I'm a little hesitant to work on something as taxon-specific as this, but in general this is very close to some of my own interests in using media metadata as a way to work with observations offline, and then sync data between a local machine and iNat. I've been working on an app that does something similar with image metadata (specifically Darwin Core embedded in XMP): https://github.com/pyinat/naturtag

I may be able to help with at least some of what you described. I'll think about it some more and get back to you.

arky commented 7 months ago

@JWCook Naturtag seems like an interesting project. Here is a use case that came up a few days ago. My friend is not a tech-savvy programmer. He has a large number of audio recordings (500 bat calls) that he would like to upload to iNaturalist. He would like to install a GUI application on a Windows/Mac machine and then:

  1. Open the naturtag application by clicking on an icon
  2. Enter an iNat username/password or API key
  3. Point the application to the source folder of bat calls and provide a default taxon (usually 'bats') and a default project name (Citibats Cambodia)
  4. Extract the date/time and GPS location from each .wav file's GUANO metadata and upload the recording to iNat

Hope this gives you some further ideas.
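Steps 3 and 4 above could be sketched roughly as follows. This is only an illustration under several assumptions: the GUANO field names 'Timestamp', 'Loc Position', and 'Note' are taken from the GUANO spec, guano-py is assumed to return a datetime and a (lat, lon) pair for the first two, and pyinaturalist's create_observation is assumed to accept the keyword arguments used here (including sounds). The function names guano_to_observation and upload_folder are hypothetical, and adding the observation to a project is not covered.

```python
from datetime import datetime
from pathlib import Path
from typing import Any, Mapping


def guano_to_observation(metadata: Mapping[str, Any], sound_path: str) -> dict:
    """Map a few GUANO fields onto iNaturalist observation parameters.

    Assumed GUANO field names: 'Timestamp', 'Loc Position', and 'Note'.
    """
    params: dict = {"sounds": [sound_path]}

    timestamp = metadata.get("Timestamp")
    if timestamp is not None:
        # Assumed to be a datetime; any other value is passed through as text
        params["observed_on_string"] = (
            timestamp.isoformat() if isinstance(timestamp, datetime) else str(timestamp)
        )

    position = metadata.get("Loc Position")
    if position is not None:
        # Accept either a (lat, lon) pair or a raw "lat lon" string
        if isinstance(position, str):
            position = position.split()
        params["latitude"] = float(position[0])
        params["longitude"] = float(position[1])

    note = metadata.get("Note")
    if note:
        params["description"] = str(note)
    return params


def upload_folder(folder: str, access_token: str, taxon_id: int) -> None:
    """Create one observation per .wav file in a folder.

    Requires guano and pyinaturalist; both are imported here so that the
    mapping function above can be used without them.
    """
    from guano import GuanoFile
    from pyinaturalist import create_observation

    for wav in sorted(Path(folder).glob("*.wav")):
        metadata = dict(GuanoFile(str(wav)).items())
        params = guano_to_observation(metadata, str(wav))
        create_observation(access_token=access_token, taxon_id=taxon_id, **params)
```

Keeping the mapping separate from the upload makes it possible to preview what would be sent for each file before anything is created on iNat.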

JWCook commented 7 months ago

I spent a bit of time looking at the GUANO spec and trying out guano-py with samples from your project, and here are my notes so far.

GUANO -> iNat

GUANO attributes that can be directly mapped to iNat observations:

Usage

As for exposing GUANO metadata as a list, it looks like guano-py already has a GuanoFile object that acts as a dict-like interface, for example:

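from guano import GuanoFile

g = GuanoFile('test.wav')
print(list(g.items()))  # list() shows the pairs even if items() returns an iterator
print(g['Note'])
g['Note'] = 'New note'
g.write()  # save the change back to test.wav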

To turn that into a list, you could do something like:

guano_list = [f'{k}: {v}' for k, v in g.items()]

Would that be sufficient, or is there a different way you want that to be presented?

Observation fields

Something else that might be useful (also mentioned in this issue) is making use of iNat's observation fields. Those could allow you to make additional GUANO attributes visible and searchable on inaturalist.org. Here are a few that could be mapped to existing observation fields:

Those are the most obvious ones, but there are probably more. And you can create new observation fields if needed (although it's always good to check if a similar one already exists). Let me know if this is something you'd be interested in.
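To make the observation field idea a bit more concrete, here is a minimal sketch. The GUANO field names ('Make', 'Model', 'Samplerate') are just examples picked from the spec, the numeric IDs are placeholders rather than real observation field IDs, and build_observation_fields is a hypothetical helper. The resulting dict is assumed to be usable as the observation_fields argument of pyinaturalist's create_observation.

```python
from typing import Any, Dict, Mapping

# Hypothetical mapping of GUANO field name -> iNaturalist observation field ID.
# These IDs are placeholders, NOT real field IDs; look up or create suitable
# fields on inaturalist.org before using this.
GUANO_TO_OBSERVATION_FIELD: Dict[str, int] = {
    "Make": 1001,
    "Model": 1002,
    "Samplerate": 1003,
}


def build_observation_fields(
    metadata: Mapping[str, Any],
    mapping: Mapping[str, int] = GUANO_TO_OBSERVATION_FIELD,
) -> Dict[int, Any]:
    """Return {observation_field_id: value} for mapped GUANO fields that have a value."""
    return {
        field_id: metadata[name]
        for name, field_id in mapping.items()
        if metadata.get(name) not in (None, "")
    }
```

GUANO fields that are missing from the file, or that have no mapping, are simply skipped.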

iNat -> GUANO

If needed, all of the mappings above would be easy enough to do in reverse (after downloading an obs + audio from iNat, as in your second use case described above).

Let's say you have GPS coords from your wav file, you create a new observation with that, and then you update the coords on iNat (say, to correct for known GPS drift). Would you want to write that updated info back to the GUANO metadata?

And are there any other iNat attributes besides the ones mentioned above that you might want to write back to GUANO metadata (if present) before re-uploading?
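For the coordinates case, writing back could look something like the sketch below. It assumes 'Loc Position' holds a (lat, lon) pair and works on any dict-like object, which GuanoFile is assumed to be based on the usage shown earlier. sync_position is a hypothetical helper, not part of guano-py or pyinaturalist.

```python
from typing import Any, MutableMapping


def sync_position(
    metadata: MutableMapping[str, Any],
    latitude: float,
    longitude: float,
    tolerance: float = 1e-6,
) -> bool:
    """Overwrite 'Loc Position' if it differs from the given coordinates.

    Returns True if the metadata was changed, so the caller knows whether
    the file needs to be rewritten.
    """
    new = (float(latitude), float(longitude))
    old = metadata.get("Loc Position")
    if isinstance(old, str):
        # Raw GUANO text form: "lat lon"
        old = old.split()
    if old is not None and len(old) == 2:
        if all(abs(float(a) - b) <= tolerance for a, b in zip(old, new)):
            return False
    metadata["Loc Position"] = new
    return True


# Intended use with guano-py (assumed, not run here):
#   g = GuanoFile('test.wav')
#   if sync_position(g, corrected_lat, corrected_lon):
#       g.write()
```

Only rewriting the file when something actually changed avoids touching recordings whose coordinates already match iNat.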

arky commented 6 months ago

Thank you @JWCook for the insights. I am going to close this issue for now.