graphprotocol / graph-node

Graph Node indexes data from blockchains such as Ethereum and serves it over GraphQL
https://thegraph.com
Apache License 2.0

Can't use Infura IPFS with HTTP Basic Auth #3981

Open endersonmaia opened 2 years ago

endersonmaia commented 2 years ago

graph-node fails when I set the IPFS address with HTTP Basic Auth credentials in the URL.

What is the current behavior?

Starting with graph-cli:0.33.0, I can add the --headers parameter to deploy a subgraph using Infura's IPFS.

npx graph deploy \
    --product hosted-service \
    --version-label v1.0.0 \
    --headers "{\"Authorization\": \"Basic bXlfdXNlcl9uYW1lOm15X3NlY3JldF9wYXNzd29yZA==\"}" \
    --ipfs https://ipfs.infura.io:5001 my-subgraph subgraph.yaml \
    --node http://localhost:8020
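For reference, the value after "Basic" in the Authorization header is the base64 encoding of user:password. A minimal sketch of how such a value can be built follows; the function name basic_auth_header is made up for illustration, and the credentials are the placeholder ones encoded in the command above, not real ones.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of graph-cli or graph-node):
# build an HTTP Basic Auth header value from a user name and a password.
basic_auth_header() {
    local user=$1 pass=$2
    # printf avoids the trailing newline that echo would add to the input;
    # tr strips line wraps that some base64 implementations insert.
    printf 'Basic %s' "$(printf '%s:%s' "$user" "$pass" | base64 | tr -d '\n')"
}

# Prints: Basic bXlfdXNlcl9uYW1lOm15X3NlY3JldF9wYXNzd29yZA==
basic_auth_header my_user_name my_secret_password
```

The result can be pasted into the JSON passed to --headers.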

I configured graph-node with --ipfs=https://bXlfdXNlcl9uYW1l:HIDDEN_PASSWORD@ipfs.infura.io:5001/

But with graph-node configured to use Infura's IPFS this way, I get errors:

Sep 22 19:52:34.954 INFO Graph Node version: 0.27.0 (2022-08-01)
Sep 22 19:52:34.954 WARN GRAPH_POI_ACCESS_TOKEN not set; might leak POIs to the public via GraphQL
Sep 22 19:52:34.954 INFO Reading configuration file `/graph-node/config.toml`
Sep 22 19:52:34.955 WARN No fork base URL specified, subgraph forking is disabled
Sep 22 19:52:34.955 INFO Starting up
Sep 22 19:52:34.955 INFO Trying IPFS node at: https://bXlfdXNlcl9uYW1l:HIDDEN_PASSWORD@ipfs.infura.io:5001/
Sep 22 19:52:34.972 INFO Creating transport, capabilities: , url: https://goerli.infura.io/v3/__REDACTED__, provider: goerli
Sep 22 19:52:35.011 INFO Successfully connected to IPFS node at: https://bXlfdXNlcl9uYW1l:HIDDEN_PASSWORD@ipfs.infura.io:5001/
Sep 22 19:52:35.018 INFO Creating transport, capabilities: , url: https://mainnet.infura.io/v3/__REDACTED__, provider: mainnet
Sep 22 19:52:35.084 INFO Connecting to Postgres, weight: 1, conn_pool_size: 10, url: postgresql://graph-node-indexer-rg6jmQ:HIDDEN_PASSWORD@__REDACTED__/graph-node, pool: main, shard: primary
...
Sep 22 20:07:30.219 INFO Received subgraph_deploy request, params: SubgraphDeployParams { name: SubgraphName("cartesi/pos-goerli"), ipfs_hash: DeploymentHash("QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR"), node_id: None, debug_fork: None }, component: JsonRpcServer
Sep 22 20:07:30.903 WARN Trying again after IPFS stat failed (attempt #10) with result Err(reqwest::Error { kind: Status(403), url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("ipfs.infura.io")), port: Some(5001), path: "/api/v0/files/stat", query: Some("arg=/ipfs/QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR"), fragment: None } }), sgd: 0, subgraph_id: QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR, component: SubgraphRegistrar
Sep 22 20:07:31.894 WARN Trying again after IPFS stat failed (attempt #11) with result Err(reqwest::Error { kind: Status(403), url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("ipfs.infura.io")), port: Some(5001), path: "/api/v0/files/stat", query: Some("arg=/ipfs/QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR"), fragment: None } }), sgd: 0, subgraph_id: QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR, component: SubgraphRegistrar
Sep 22 20:07:33.503 WARN Trying again after IPFS stat failed (attempt #12) with result Err(reqwest::Error { kind: Status(403), url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("ipfs.infura.io")), port: Some(5001), path: "/api/v0/files/stat", query: Some("arg=/ipfs/QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR"), fragment: None } }), sgd: 0, subgraph_id: QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR, component: SubgraphRegistrar

What is the expected behavior?

graph-node should be able to get Infura IPFS files

graph-node should support HTTP Basic Auth in the URL, like https://[user:pass@]host[:port]

leoyvens commented 2 years ago

Try configuring the header in a config file: https://github.com/graphprotocol/graph-node/blob/master/docs/config.md#configuring-ethereum-providers

endersonmaia commented 2 years ago

@leoyvens that's for Ethereum providers. I need this for IPFS (the --ipfs argument).

leoyvens commented 2 years ago

Apologies, I misread. I don't believe we have this config for IPFS.

endersonmaia commented 2 years ago

I know. Should I consider this a bug or a feature request?

I can update the title accordingly.

leoyvens commented 2 years ago

I don't think this ever worked, so I'd consider it a feature request. PRs welcome though!

agourdon commented 1 year ago

Hello,

Since I have the exact same issue with Infura, I took a deeper look at it.

Basic authentication is actually well supported by reqwest::Client, so there is no problem using an IPFS URI like https://[user:pass@]host[:port].

The HTTP 403 error raised when trying to access the graph is in fact caused by an IPFS HTTP API endpoint that Infura does not support:

Sep 22 20:07:30.903 WARN Trying again after IPFS stat failed (attempt #10) with result Err(reqwest::Error { kind: Status(403), url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("ipfs.infura.io")), port: Some(5001), path: "/api/v0/files/stat", query: Some("arg=/ipfs/QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR"), fragment: None } }), sgd: 0, subgraph_id: QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR, component: SubgraphRegistrar

->

curl -vX POST "https://ipfs.infura.io:5001/api/v0/files/stat?arg=/ipfs/QmYX691Wu75US7uEiEohQ18WgFiuRjaa8hqE3PqknJKoHR" -u "<user>:<password>"
< HTTP/1.1 403 Forbidden
< Content-Type: text/plain; charset=utf-8
< Vary: Origin
< X-Content-Type-Options: nosniff
< X-Robots-Tag: noindex
< Date: Wed, 30 Nov 2022 12:33:07 GMT
< Content-Length: 26
< 
ipfs method not supported

According to the Infura documentation, there is indeed no way to retrieve the file size using the /api/v0/files/stat API.

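I managed to work around this issue by no longer depending on this call in graph-node. Its result is used only to enforce file size limits, so the second patch switches to StatApi::Block and drops the size check.

Here are my patches, FYI (I am not opening a PR because the second one is still a workaround):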

From 9b65e7eab77e3e17d13775bee7c5efbaa48c6160 Mon Sep 17 00:00:00 2001
From: Arnaud Gourdon <arnaud.gourdon13@gmail.com>
Date: Tue, 29 Nov 2022 12:24:32 +0100
Subject: [PATCH 1/2] fix(docker): fixed parsing of the URL in case a
 user/password is provided. This fix is required to support IPFS
 authentication (like infura) and prevent a wrong 2min timeout of the
 graphnode service.

diff --git a/docker/start b/docker/start
index d7729a252..99ff37b53 100755
--- a/docker/start
+++ b/docker/start
@@ -33,11 +33,13 @@ save_coredumps() {
 wait_for_ipfs() {
     # Take the IPFS URL in $1 apart and extract host and port. If no explicit
     # host is given, use 443 for https, and 80 otherwise
-    if [[ "$1" =~ ^((https?)://)?([^:/]+)(:([0-9]+))? ]]
+    if [[ "$1" =~ ^((https?)://)?(([^@]+)@)?([^:/]+)(:([0-9]+))? ]]
     then
         proto=${BASH_REMATCH[2]:-http}
-        host=${BASH_REMATCH[3]}
-        port=${BASH_REMATCH[5]}
+        user=${BASH_REMATCH[4]}
+        host=${BASH_REMATCH[5]}
+        port=${BASH_REMATCH[7]}
+
         if [ -z "$port" ]
         then
             [ "$proto" = "https" ] && port=443 || port=80
-- 
2.25.1
From a9a7f118e64a0c116ea98e2a36830ddac3c8207a Mon Sep 17 00:00:00 2001
From: Arnaud Gourdon <arnaud.gourdon13@gmail.com>
Date: Wed, 30 Nov 2022 11:20:00 +0100
Subject: [PATCH 2/2] fix(ipfs): Workaround to fix graph deployment failure on
 Infura (due to the unsupported /api/v0/files/stat HTTP API).

diff --git a/core/src/link_resolver.rs b/core/src/link_resolver.rs
index 10a065ef0..72293eef0 100644
--- a/core/src/link_resolver.rs
+++ b/core/src/link_resolver.rs
@@ -170,10 +170,10 @@ impl LinkResolverTrait for LinkResolver {
         }
         trace!(logger, "IPFS cache miss"; "hash" => &path);

-        let (size, client) = select_fastest_client_with_stat(
+        let (_size, client) = select_fastest_client_with_stat(
             self.clients.cheap_clone(),
             logger.cheap_clone(),
-            StatApi::Files,
+            StatApi::Block,
             path.clone(),
             self.timeout,
             self.retry,
@@ -182,7 +182,6 @@ impl LinkResolverTrait for LinkResolver {

         let max_cache_file_size = self.env_vars.mappings.max_ipfs_cache_file_size;
         let max_file_size = self.env_vars.mappings.max_ipfs_file_bytes;
-        restrict_file_size(&path, size, max_file_size)?;

         let req_path = path.clone();
         let timeout = self.timeout;
@@ -247,19 +246,16 @@ impl LinkResolverTrait for LinkResolver {
         // Discard the `/ipfs/` prefix (if present) to get the hash.
         let path = link.link.trim_start_matches("/ipfs/");

-        let (size, client) = select_fastest_client_with_stat(
+        let (_size, client) = select_fastest_client_with_stat(
             self.clients.cheap_clone(),
             logger.cheap_clone(),
-            StatApi::Files,
+            StatApi::Block,
             path.to_string(),
             self.timeout,
             self.retry,
         )
         .await?;

-        let max_file_size = self.env_vars.mappings.max_ipfs_map_file_size;
-        restrict_file_size(path, size, max_file_size)?;
-
         let mut stream = client.cat(path, None).await?.fuse().boxed().compat();

         let mut buf = BytesMut::with_capacity(1024);
-- 
2.25.1
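To check the URL parsing from patch 1/2 in isolation, the patched regex can be exercised outside docker/start. This is only a sketch: the wrapper function parse_ipfs_url and the sample URLs are made up for illustration, while the regex and the BASH_REMATCH indices are copied from the patch.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the patched regex from docker/start.
# Prints "proto host port" for a given IPFS URL.
parse_ipfs_url() {
    # Same pattern as in the patch; group 4 captures the optional user[:pass].
    local re='^((https?)://)?(([^@]+)@)?([^:/]+)(:([0-9]+))?'
    local proto host port
    if [[ "$1" =~ $re ]]; then
        proto=${BASH_REMATCH[2]:-http}
        host=${BASH_REMATCH[5]}
        port=${BASH_REMATCH[7]}
        if [ -z "$port" ]; then
            # No explicit port: fall back to the protocol default.
            [ "$proto" = "https" ] && port=443 || port=80
        fi
        echo "$proto $host $port"
    else
        return 1
    fi
}

parse_ipfs_url 'https://user:pass@ipfs.infura.io:5001/'  # https ipfs.infura.io 5001
parse_ipfs_url 'https://ipfs.infura.io'                  # https ipfs.infura.io 443
parse_ipfs_url 'localhost:5001'                          # http localhost 5001
```

With the original regex, the first URL would yield "user" as the host, because the host group stops at the first colon.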

Hope this will help.

leoyvens commented 1 year ago

@agourdon thank you for the detailed report. I would like to eventually not rely on files/stat.

Sledro commented 1 year ago

Also facing this issue. Did anyone find a solution?

ZacharyShortSecurrency commented 8 months ago

Any updates on this issue?

siddharth9903 commented 3 weeks ago

Any updates? This would be a major help for indexing NFTs, since that requires a vast amount of IPFS data.

leoyvens commented 2 weeks ago

Graph Node does not use files/stat anymore, so I think the original bug is actually resolved. If there are still issues on the latest version, we would need a more detailed report.