g-andrade / locus

MMDB reader for geolocation and ASN lookup of IP addresses
https://hexdocs.pm/locus/
MIT License

Ability to pass a function as a database source #28

Closed (sneako closed this issue 3 years ago)

sneako commented 3 years ago

Hi there!

We are interested in building our own geolocation db and hosting the mmdb file in S3.

We would like to make this S3 object private and our application already has its own AWS credentials, which it can use to access the private S3 object. I realize that I could write some code to use these credentials and download the mmdb to the local filesystem, but this application runs in Kubernetes and we would rather not introduce volumes if it is not necessary.

Therefore, I am proposing locus be extended to allow an {M,F,A} tuple to be passed as the db source. The function could return any of the types accepted by locus:start_loader/2.

I would be happy to implement this functionality if you agree that it is probably the best way to handle this kind of scenario.

Thanks for reading!

g-andrade commented 3 years ago

Hello,

That sounds like an excellent idea!

And thanks for offering to contribute. However, there is some nuance in these particular internals (ahem, someone might even call it overengineering) - stuff like caching on the local filesystem, retry policy and event production (which ultimately drives logging) - and I'd prefer to look into it myself, at least for now.

That being said: I've covered the basics already but haven't yet pushed (it's still missing tests and changes to documentation.)

This is the behaviour I've currently got:

-callback describe_source(Args) -> {remote, Description} | {local, Description}
        when Args :: term(),
             Description :: term().

-callback fetch(Args, PreviouslyModifiedOn) -> {fetched, Success}
                                               | dismissed
                                               | {error, Reason}
        when Args :: term(),
             PreviouslyModifiedOn :: calendar:datetime() | undefined,
             Success :: success(),
             Reason :: term().

-type success() ::
    #{ format := locus_loader:blob_format(), % tgz | mmdb | ... | unknown
       content := binary(),
       modified_on => calendar:datetime()
     }.

As for the return value of fetch/2:

I believe this should cover your use case.
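For illustration, here is a minimal sketch of a module written against the draft callbacks quoted above (describe_source/1 and fetch/2). It is not the released API, which may differ from this draft. The module name my_s3_fetcher, the shape of Args, and the download fun inside it are all hypothetical; the fun stands in for whatever S3 client the application already uses. The sketch also assumes that dismissed is the right answer when the object has not changed since PreviouslyModifiedOn, which the thread does not spell out.

```erlang
-module(my_s3_fetcher).
%% Sketch only: follows the DRAFT callbacks quoted in this thread,
%% so no -behaviour attribute is declared here.
-export([describe_source/1, fetch/2]).

%% Args is a hypothetical map: #{bucket, key, download}.
%% `download` is a fun(Args) -> {ok, Content, ModifiedOn} | {error, Reason}
%% supplied by the caller (for example, a wrapper around an S3 client).
describe_source(#{bucket := Bucket, key := Key}) ->
    {remote, {s3, Bucket, Key}}.

fetch(#{download := Download} = Args, PreviouslyModifiedOn) ->
    case Download(Args) of
        {ok, _Content, ModifiedOn}
          when PreviouslyModifiedOn =/= undefined,
               ModifiedOn =< PreviouslyModifiedOn ->
            %% Assumption: nothing newer than what was loaded before.
            dismissed;
        {ok, Content, ModifiedOn} ->
            {fetched, #{format => mmdb,
                        content => Content,
                        modified_on => ModifiedOn}};
        {error, Reason} ->
            {error, Reason}
    end.
```

Passing the download step in as a fun keeps the AWS credentials and HTTP details in the application, and lets the module be exercised without network access.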

I may look into it on the weekend (this library is a hobby of mine), but no promises.

sneako commented 3 years ago

Wow, that looks like it should definitely cover my use case. Thank you!

g-andrade commented 3 years ago

Released in 2.0.0 (also published to Hex.)

Beware that it includes a few breaking changes, all of them documented in MIGRATION.md. The most noticeable one is that locus:lookup/2 now returns not_found instead of {error, not_found} when no data is found.
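As a rough sketch of what that change means for calling code, the helper below classifies a result shaped like the return value of locus:lookup/2. The module and function names are hypothetical, and the clause for {error, not_found} is only there to illustrate code that tolerates both shapes while migrating.

```erlang
-module(geo_lookup).
-export([classify/1]).

%% Takes a value shaped like the result of locus:lookup/2.
classify({ok, Entry}) -> {found, Entry};
classify(not_found) -> missing;              % shape as of 2.0.0
classify({error, not_found}) -> missing;     % shape before 2.0.0
classify({error, Reason}) -> {failed, Reason}.
```

A caller would wrap the lookup, for example geo_lookup:classify(locus:lookup(DatabaseId, Address)), where DatabaseId and Address are whatever the application already passes to locus.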

g-andrade commented 3 years ago

The behaviour to implement is locus_custom_fetcher.

The fetcher can then be used by passing it to locus:start_loader/{2,3} or locus:loader_child_spec/{2,3,4} under the database edition, as {custom_fetcher, Module, Args}.
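A minimal sketch of that wiring, assuming locus 2.0.0 or later is available as a dependency. The wrapper module geo_loader and its function names are hypothetical; only the {custom_fetcher, Module, Args} tuple and locus:start_loader/2 come from the comment above.

```erlang
-module(geo_loader).
-export([source/2, start/3]).

%% Build the database edition term described above.
source(FetcherModule, FetcherArgs) ->
    {custom_fetcher, FetcherModule, FetcherArgs}.

%% Start a loader for DatabaseId backed by the given fetcher module.
%% Requires the locus application to be started.
start(DatabaseId, FetcherModule, FetcherArgs) ->
    locus:start_loader(DatabaseId, source(FetcherModule, FetcherArgs)).
```

FetcherModule would be a module implementing locus_custom_fetcher, and FetcherArgs is passed through to its callbacks unchanged.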