application-research / autoretrieve

A server to make GraphSync data accessible on IPFS

Support indexer endpoints #8

Closed hannahhoward closed 2 years ago

hannahhoward commented 2 years ago

Goals

Make autoretrieve a more flexible bridge by allowing it to talk to regular content indexers rather than only Estuary's find-retrieval-candidates endpoint.

Implementation

Extract the actual fetching of retrieval candidates out of the retrieval code and into an "endpoint" interface. Implement one version for Estuary and one for content indexers that follow StoreTheIndex's API protocol (see https://github.com/filecoin-project/storetheindex).

The biggest challenge is in coercing data to match between results -- Estuary returns miner addresses, while indexers return peer.AddrInfo. This ends up producing a fair number of changes:

For Discussion

hannahhoward commented 2 years ago

tag @willscott cause he's not in the organization as a reviewer yet

elijaharita commented 2 years ago

tested, unfortunately hit a segfault

estuary@bitswap-gateway-1:~/autoretrieve$ GOLOG_FILE="logs/indexer.txt" GOLOG_LOG_LEVEL="filclient=debug" go run .  --endpoint-type indexer
2022-03-01T21:26:51.334Z        INFO    autoretrieve    metrics/basic.go:28     Using default wallet address f3qib5lfkzrmtkwex6tciuq3nnelsz6dcfyp5a2ufpcucu4p42bvg3kvoaxtbbrsexoepndn6xq6szywtwb4gq
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x165e02e]

goroutine 15509 [running]:
github.com/application-research/autoretrieve/endpoint.(*IndexerEndpoint).FindCandidates(0xc0006d4000, {0x33d9c28, 0xc000052080}, {{0xc006f377d0, 0x64246c}})
        /home/estuary/autoretrieve/endpoint/indexer.go:34 +0xce
github.com/application-research/autoretrieve/filecoin.(*Retriever).lookupCandidates(0xc000fe24e0, {0x33d9c28, 0xc000052080}, {{0xc006f377d0, 0xc000612ff0}})
        /home/estuary/autoretrieve/filecoin/retriever.go:322 +0x57
github.com/application-research/autoretrieve/filecoin.(*Retriever).Request(0xc000fe24e0, {{0xc006f377d0, 0xc000052080}})
        /home/estuary/autoretrieve/filecoin/retriever.go:116 +0x5c
github.com/application-research/autoretrieve/bitswap.(*Provider).ReceiveMessage(0xc0007a01e0, {0x33d9c28, 0xc000052080}, {0xc0004bcf90, 0x4a1e4ff}, {0x34278f8, 0xc000612ff0})
        /home/estuary/autoretrieve/bitswap/provider.go:198 +0x3a6
github.com/ipfs/go-bitswap/network.(*impl).handleNewStream(0xc000288f00, {0x34178f8, 0xc0039e0ca0})
        /home/estuary/go/pkg/mod/github.com/ipfs/go-bitswap@v0.5.1/network/ipfs_impl.go:423 +0x37e
github.com/libp2p/go-libp2p/p2p/host/basic.(*BasicHost).SetStreamHandler.func1({0xc0011fc300, 0x13}, {0x7faec0381990, 0xc0039e0ca0})
        /home/estuary/go/pkg/mod/github.com/libp2p/go-libp2p@v0.18.0-rc3/p2p/host/basic/basic_host.go:573 +0x76
created by github.com/libp2p/go-libp2p/p2p/host/basic.(*BasicHost).newStreamHandler
        /home/estuary/go/pkg/mod/github.com/libp2p/go-libp2p@v0.18.0-rc3/p2p/host/basic/basic_host.go:416 +0x7a5
exit status 2
hannahhoward commented 2 years ago

@elijaharita sorry about that, I believe I fixed it.

elijaharita commented 2 years ago

new issue

estuary@bitswap-gateway-1:~/autoretrieve$ GOLOG_FILE="logs/indexer.txt" GOLOG_LOG_LEVEL="filclient=debug" go run . --per-miner-retrieval-limit 1 --whitelist f010446 --timeout 10m --fullrt --endpoint-type indexer
go: downloading github.com/filecoin-project/ffi-stub v0.3.0
2022-03-03T19:57:23.609Z        INFO    autoretrieve    metrics/basic.go:28     Using default wallet address f3aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaagazosta
2022-03-03T19:57:24.519Z        ERROR   autoretrieve    metrics/basic.go:34     Could not get candidates: batch find query failed: Bad Request
2022-03-03T19:57:24.568Z        ERROR   autoretrieve    metrics/basic.go:34     Could not get candidates: batch find query failed: Bad Request
2022-03-03T19:57:24.574Z        ERROR   autoretrieve    metrics/basic.go:34     Could not get candidates: batch find query failed: Bad Request
2022-03-03T19:57:24.648Z        ERROR   autoretrieve    metrics/basic.go:34     Could not get candidates: batch find query failed: Bad Request
2022-03-03T19:57:24.648Z        ERROR   autoretrieve    metrics/basic.go:34     Could not get candidates: batch find query failed: Bad Request
2022-03-03T19:57:24.653Z        ERROR   autoretrieve    metrics/basic.go:34     Could not get candidates: batch find query failed: Bad Request
2022-03-03T19:57:24.730Z        ERROR   autoretrieve    metrics/basic.go:34     Could not get candidates: batch find query failed: Bad Request
2022-03-03T19:57:24.731Z        ERROR   autoretrieve    metrics/basic.go:34     Could not get candidates: batch find query failed: Bad Request
...

fyi - i totally don't mind testing, but if you're interested you should be able to run autoretrieve out of the box yourself with no problem. i use GOLOG_LOG_LEVEL="filclient=debug" go run . --per-miner-retrieval-limit 1 --whitelist f0minerhere --timeout 10m --fullrt --endpoint-type indexer. just keep in mind that whatever miner you use will get any retrieval that gets started stuck in their queue. whitelisting 1 miner and setting the retrieval limit to 1 with high timeout usually keeps started retrievals at 1 per session.

other issue - with the recent commit autoretrieve is now reporting my wallet address as f3aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaagazosta, which is not correct. we've actually encountered this issue before when doing ffi stuff, i don't remember the specific cause.

whyrusleeping commented 2 years ago

The ffi stub doesn't have a proper implementation of the bls signing library. We either need to continue using the ffi code, or (better) integrate a pure golang version of the bls code (which I think exists, cc @ribasushi).

ribasushi commented 2 years ago

@elijaharita @whyrusleeping: you are in luck: @jsign's awesomeness got you covered: https://pkg.go.dev/github.com/jsign/go-filsigner#section-readme

You could drop ffi entirely

hannahhoward commented 2 years ago

@elijaharita @whyrusleeping sorry about the ffi-stub -- I just could not get my freakin build to compile on the Amazon box we've set up and was giving up... I didn't consider the wallet implications. I'll put it back and figure out how to solve my own problem.

@elijaharita the good news is we've tried with the indexer ourselves and had success. The issue here is you need to give it the right endpoint -- unless Estuary has an indexer-like endpoint already -- add --endpoint https://cid.contact to your command.
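A startup check could surface this kind of misconfiguration earlier than a stream of "Bad Request" errors. The following is a hypothetical sketch only: `validateEndpoint` is not part of autoretrieve, the `"estuary"` type value is an assumption, and only the `--endpoint-type indexer` and `--endpoint` flags are taken from this thread.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// validateEndpoint is a hypothetical startup check (not autoretrieve code).
// It rejects indexer mode when no usable endpoint URL was supplied.
func validateEndpoint(endpointType, endpointURL string) error {
	switch endpointType {
	case "estuary":
		// Assumption: the Estuary endpoint has a built-in default URL.
		return nil
	case "indexer":
		if endpointURL == "" {
			return errors.New("--endpoint-type indexer requires --endpoint (for example https://cid.contact)")
		}
		u, err := url.Parse(endpointURL)
		if err != nil || u.Scheme == "" || u.Host == "" {
			return fmt.Errorf("invalid --endpoint %q", endpointURL)
		}
		return nil
	default:
		return fmt.Errorf("unknown --endpoint-type %q", endpointType)
	}
}

func main() {
	// Indexer mode without an endpoint is reported immediately.
	fmt.Println(validateEndpoint("indexer", ""))
	// Indexer mode with an explicit endpoint passes the check.
	fmt.Println(validateEndpoint("indexer", "https://cid.contact"))
}
```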

hannahhoward commented 2 years ago

@ribasushi that lib looks great but I think it would need to get incorporated in other places? not sure.

ribasushi commented 2 years ago

@hannahhoward point is you can just make it part of the stub itself

hannahhoward commented 2 years ago

@ribasushi almost surely a thing that can be done, not something I am personally solving right this second.

elijaharita commented 2 years ago

ohhkay got it, didn't realize endpoint needed to be set for indexer mode. seems to be working now! thanks for making this happen!