This takes the results of the app_hash fetching and fetches the corresponding IPFS content for each hash.
While fetching, it separates the results into two pieces: found and not_found. I ran this once and it took 58 minutes for 2310 records (this is our "seed data"). The output is attached here:
Output
Among 2310 total app hashes, only 40 were "not found" (and I suspect they were Null). You can find them all in the attached zip file.
Here is a snippet of the logs, in case we want to add something (for example a "content successfully written to..." message, since that appears to be missing):
Logs From Single Run
```
2022-11-17 15:42:00,637 INFO dune_client.file creating write path /dune-bridge/data
2022-11-17 15:42:00,638 WARNING pysrc.fetch.dune block range file sync_block.csv not found, using genesis block 12153262
2022-11-17 15:42:00,638 DEBUG pysrc.fetch.dune Executing Query(query_id=1615490, name='Latest Possible App Hash Block', params=None)
2022-11-17 15:42:00,960 INFO dune_client.base_client waiting for query execution 01GJ30M7GNR48JTXFR44241G8P to complete: ExecutionState.EXECUTING
2022-11-17 15:42:11,243 DEBUG pysrc.fetch.dune Got 1 results for execution 01GJ30M7GNR48JTXFR44241G8P
2022-11-17 15:42:11,243 DEBUG pysrc.fetch.dune Executing Query(query_id=1610025, name='Unique App Hashes', params=[, ])
2022-11-17 15:42:11,562 INFO dune_client.base_client waiting for query execution 01GJ30MHV0R9E6J5094VKZXPBP to complete: ExecutionState.EXECUTING
2022-11-17 15:42:22,019 DEBUG pysrc.fetch.dune Got 2310 results for execution 01GJ30MHV0R9E6J5094VKZXPBP
2022-11-17 15:42:23,866 DEBUG __main__ Found content for 0xd68a5c78d920efc8e8c4785baadb0d9c64d21ea88c22592631d6135fdba6fa94 at CID bafybeigwrjohrwja57eorrdylovnwdm4mtjb5kemejmsmmowcnp5xjx2sq
2022-11-17 15:42:24,778 DEBUG __main__ Found content for 0x2757c919984771aaf288c738f6507886b9e218baeb199bbe6639c707742868ee at CID bafybeibhk7ertgchogvpfcghhd3fa6egxhrbroxldgn34zrzy4dxikdi5y
2022-11-17 15:42:25,597 DEBUG __main__ Found content for 0xc80a7dcd584a25427ae0cddc53558076c156331a29e666a45cd6cef9867239cd at CID bafybeigibj642wckevbhvygn3rjvladwyfldggrj4ztkixgwz34ym4rzzu
2022-11-17 15:46:36,757 DEBUG __main__ No content found for 0x1487f733547707805dc8bb4b738f97b098bd793d8065934efdf9eb45de5f45e1 at CID bafybeiauq73tgvdxa6af3sf3jnzy7f5qtc6xspmamwju57pz5nc54x2f4e after 3 retries
2022-11-17 15:46:31,966 DEBUG __main__ Found content for 0xe24b3b0af075daa388fd153b526c2fc8204a00d5671fb95fdbc3484d82bf4e65 at CID bafybeihcjm5qv4dv3kryr7ivhnjgyl6iebfabvlhd64v7w6djbgyfp2omu
2022-11-17 15:46:32,954 DEBUG __main__ Found content for 0xa687e0633ad964efaac51f6ca6064bf23447beb2238dc89eae7c98283ce3cf2f at CID bafybeifgq7qggowzmtx2vri7nstams7sgrd35mrdrxej5lt4taudzy6pf4
...
```
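The hash/CID pairs in the log above are consistent with wrapping the 32-byte app hash as a sha2-256 digest in a CIDv1 with the dag-pb codec and base32 encoding. The sketch below is an inference from those log lines, not the code in this PR, and `app_hash_to_cid` is a hypothetical name used only for illustration.

```python
import base64


def app_hash_to_cid(app_hash: str) -> str:
    """Derive a base32 CIDv1 string from a 0x-prefixed 32-byte app hash.

    Assumption: the app hash is the raw sha2-256 digest of a dag-pb block.
    """
    hex_part = app_hash[2:] if app_hash.startswith("0x") else app_hash
    digest = bytes.fromhex(hex_part)
    if len(digest) != 32:
        raise ValueError("expected a 32-byte app hash")
    # 0x01 = CID version 1, 0x70 = dag-pb codec,
    # 0x12 = sha2-256 multihash code, 0x20 = digest length (32 bytes)
    cid_bytes = bytes([0x01, 0x70, 0x12, 0x20]) + digest
    # Multibase prefix "b" marks lowercase base32 without padding
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


if __name__ == "__main__":
    print(app_hash_to_cid(
        "0xd68a5c78d920efc8e8c4785baadb0d9c64d21ea88c22592631d6135fdba6fa94"
    ))
```

Run on the first "Found content" entry in the log, this reproduces the CID printed there.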
Follow ups
There are a few nice "todos" left in this code, but I think we might actually be ready with this as is.
To summarize some follow up tasks:
[x] append missing records to an already existing missing_content file (so that when we decide to try again they are all in one place). This will need to be implemented on the dune-client side. Append can also be used to write content as it is found instead of storing all results in memory. This is not a HUGE issue, because this seed data will account for 99% of the long-running script time. All future runs are expected to complete in seconds. This is already a recorded issue in dune-client: https://github.com/cowprotocol/dune-client/issues/37
[ ] consider recording the number of timeouts and possibly removing this "max retries" loop on the content fetching. I sincerely suspect that all found app content was found on the first attempt and all not found content does not exist.
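To make the retry question concrete, here is a minimal sketch of a fetch loop that splits results into found and not_found and records how many attempts each hash needed. The names `fetch_with_retries`, `fetch`, and `max_retries` are hypothetical, and `fetch` stands in for whatever call retrieves the IPFS content; this is not the implementation in this PR.

```python
from typing import Callable, Dict, List, Optional, Tuple


def fetch_with_retries(
    app_hashes: List[str],
    fetch: Callable[[str], Optional[dict]],
    max_retries: int = 3,
) -> Tuple[Dict[str, dict], List[str], Dict[str, int]]:
    """Partition app hashes into found / not_found, counting attempts per hash.

    `fetch` is assumed to return the content, or None on a miss or timeout.
    """
    found: Dict[str, dict] = {}
    not_found: List[str] = []
    attempts: Dict[str, int] = {}
    for app_hash in app_hashes:
        content = None
        for attempt in range(1, max_retries + 1):
            attempts[app_hash] = attempt
            content = fetch(app_hash)
            if content is not None:
                break  # stop retrying as soon as content is returned
        if content is None:
            not_found.append(app_hash)
        else:
            found[app_hash] = content
    return found, not_found, attempts
```

If every entry in `found` turns out to have an attempt count of 1, that would support the suspicion that the retry loop adds run time without recovering any content.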
I am going to merge this as is for now and follow up with the logic that picks up missing records on the next run and writes empty content after "giveUp" is exceeded.
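As a rough illustration of the "pick up missing records on the next run" idea, the sketch below appends missing hashes to a CSV file and reads them back. It uses only the standard library and is not the dune-client file API; the single `app_hash` column and the function names are assumptions made for this example.

```python
import csv
import os
from typing import Iterable, List


def append_missing(path: str, app_hashes: Iterable[str]) -> None:
    """Append app hashes to the missing-content CSV, writing a header if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["app_hash"])
        for app_hash in app_hashes:
            writer.writerow([app_hash])


def load_missing(path: str) -> List[str]:
    """Return the app hashes recorded so far, or an empty list if no file exists."""
    if not os.path.exists(path):
        return []
    with open(path, newline="") as f:
        return [row["app_hash"] for row in csv.DictReader(f)]
```

A later run could call `load_missing` first, retry those hashes, and append only the ones that are still missing.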
data.zip -- Full content ~172Kb
[ ] Post Results to Dune!