protomaps / go-pmtiles

Single-file executable tool for working with PMTiles archives
BSD 3-Clause "New" or "Revised" License
350 stars 48 forks source link

Add support for uploading to Arweave #163

Closed pescennius closed 4 months ago

pescennius commented 4 months ago

Rationale

Scope of Changes

Reproducing

bdon commented 4 months ago

Can you find a different solution for uploading? I'm not comfortable adding this many additional dependencies as well as a platform that other users haven't requested.

pescennius commented 4 months ago

@bdon

That being said, I do believe there is a compromise that might be able to address all of this. Instead of bundling this into the pmtiles CLI. I just have this PR add the ability to use plugin based implementations, so that I can maintain the arweave functionality myself as a plugin I can call at runtime. The actual code change here would look something like:

import (
  "fmt"
  "slices"
  "net/url"
  "plugin"
)

type Uploader interface  {
  Upload(*log.Logger, string, string, string, int) error
}

// Upload a pmtiles archive to a bucket.
func Upload(logger *log.Logger, input string, bucket string, key string, pluginPath string, maxConcurrency int) error {
    parsedUri, err := url.Parse(bucket)
    if err != nil {
        return fmt.Errorf("unable to parse bucket uri: %w", err)
    }

    if slices.Contains(GcloudSupportedSchemes, parsedUri.Scheme) {
          return uploadWithGoCloud(logger, input, parsedUri, key, maxConcurrency)
        } else if pluginPath != "" {
          return uploadWithPlugin(logger, input, parsedUri, key, maxConcurrency)         
        } else {
           return fmt.Errorf("scheme %q is not supported by default and no plugin was provided. Please try again with a supported scheme or a plugin that supportes %q", parsedUri.Scheme, parsedUri.Scheme)
        }

        return nil
}

// uploadWithPluginexpects loads a go plugin 
func uploadWithPlugin(logger *log.Logger, input string, bucket string, key string, pluginPath string, maxConcurrency int) error {
        p, err := plugin.Open(pluginPath)
    if err != nil {
        return fmt.Errorf("unable to open plugin: %w", err)
    }

       symbol, err := p.Lookup("Uploader")
       if err != nil {
           return fmt.Errorf("plugin does not expose an Uploader interface: %w", err)
      }

      uploader, ok := symbol.(Uploader)
      if !ok {
         return fmt.Errorf("Uploader found in plugin does not implement Uploader interface")
      }

      return uploader.Upload(logger, input, bucket, key, maxConcurrency

}

and the functionality for arweave could be called like

pmtiles upload  ~/path_to_test_pmtile name_for_file_on_arweave --plugin="some/path/to/arweave.so" --bucket="arweave://arweave.net/path_to/ardrive-demo-wallet.json?foo bar1"

The benefits of this are:

bdon commented 4 months ago

Thanks for the details. I don't see upload as a core feature of this CLI, because it doesn't currently do anything specific to PMTiles. The set of providers is intentionally limited to those provided by gocloud, plus filesystem/unauthenticated HTTP, to minimize long term maintenance + code ownership and capture a large number (but not all) use cases.

What advantage would adding this have over using any other CLI for uploading files to Arweave? PMTiles is intentionally just a normal file, so any existing tools for that purpose should work.

pescennius commented 4 months ago

I guess I don't understand what the hesitation is around my suggested compromise around plugins? It seems pretty in line with the goals you stated. It won't add dependencies, its not a breaking change, and it won't add anything for you to maintain because any actual plugins will be maintained by their developers. I'm also willing to write this because its quite simple.

The advantage of these things being directly in the CLI is it makes it easier to onboard people into using PMTiles when the tooling supports the backends. A lot of people using this might be front end focused developers who aren't super comfortable chaining together things on the command line. I have students where this is would be helpful for example. At the end of the day its your call and your repo, I'll just live with a fork if I have to. I guess I can also upstream this to gocloud but then `go-pmtiles would have to eat the dependencies.

thisisaaronland commented 4 months ago

FWIW, one option would be to define the pmtiles tool as a Go-language interface with a default implementation that targets the existing bucket sources.

That would provide the ability for bespoke implementations targeting Arweave or other providers that could exist outside the core package and be imported on an as-needed basis for individual users. This would also involve a potentially "non-trivial" refactoring of the go-pmtiles code base so... maybe not?

The other alternative would be to implement the gocloud.dev/blob interfaces for Arweave. Because of the way that go-pmtiles/main.go is structured this would still require cloning the file in order to import this new package but it's only 237 lines so... maybe?

pescennius commented 4 months ago

FWIW, one option would be to define the pmtiles tool as a Go-language interface with a default implementation that targets the existing bucket sources.

@thisisaaronland that's essentially what I'm suggesting by using the plugin package. It just makes it so that those other implementations of the interface can be compiled separately than go-pmtiles and loaded at runtime as plugins. That takes you guys out of the loop for maintaining any implementations people think they need and keeps the dependencies on go-pmtiles light.

thisisaaronland commented 4 months ago

I was unaware of the plugin system. I can't speak for @bdon but I imagine the lack of Windows support, not to mention all the other caveats around plugins, will be a show-stopper:

It does feel like the best approach would be to implement the gocloud.dev/blob interfaces for Arweave (which may be cold comfort since even I haven't been able to get around to doing the same for other services...)

pescennius commented 4 months ago

@thisisaaronland they don't work on Windows but AFAIK that wouldn't prevent go-pmtiles from working as-is on windows. It just means that windows users wouldn't have access to plugins. Upstreaming to gocloud.dev/blob, is doable, but I was trying to be mindful of @bdon desire to minimize the weight of dependencies. If I upstream to gocloud.dev/blob, then go-pmtiles is going to eat those dependencies.

bdon commented 4 months ago

I guess I don't understand what the hesitation is around my suggested compromise around plugins? It seems pretty in line with the goals you stated. It won't add dependencies, its not a breaking change, and it won't add anything for you to maintain because any actual plugins will be maintained by their developers. I'm also willing to write this because its quite simple.

Just because it is easy to add does not mean we should add it. The design goal of this program is to be easy for a solo developer to maintain over several years. Adding a plugin system might take 5 minutes, but maintaining that additional surface area over years across other changes in the program could accumulate to days of work of volunteer labor that I need to do in order to accommodate that single use case.

I suggest you can maintain a fork highlighting the additional platform and if a significant audience needs those platforms we can evaluate it later, thanks.

pescennius commented 4 months ago

Just because it is easy to add does not mean we should add it. The design goal of this program is to be easy for a solo developer to maintain over several years.

I gave you an implementation that is quite easy for anyone to maintain, given the skillset necessary to maintain the existing code. the surface area was quite small(under 20 lines of code) and isolated to just the Upload() function such that "maintaining that additional surface area over years across other changes in the program could accumulate to days of work of volunteer labor" is not a really reasonable expectation. I'm happy to maintain the fork separately, I'm not entitled to you accepting this change. However, let's be honest with ourselves that there wasn't a technical reason this is too difficult to maintain. 😅 its okay to say you aren't interested enough in the use case to deal with it, it just doesn't help anyone to conjure technical justifications to decisions that weren't made for technical reasons.