fidlabs / Open-Data-Pathway


[DataCap Application] web3eye.io #6

Closed. kikakkz closed this issue 3 months ago.

kikakkz commented 8 months ago

Version

1

DataCap Applicant

kikakkz

Project ID

1

Data Owner Name

web3eye.io

Data Owner Country/Region

Hong Kong

Data Owner Industry

Web3 / Crypto

Website

https://testnet.web3eye.io

Social Media Handle

@web3_eye

Social Media Type

Slack

What is your role related to the dataset

Dataset Owner

Total amount of DataCap being requested

1

Unit for total amount of DataCap being requested

PiB

Expected size of single dataset (one copy)

512

Unit for expected size of single dataset

TiB

Number of replicas to store

4

Weekly allocation of DataCap requested

50

Unit for weekly allocation of DataCap requested

TiB

On-chain address for first allocation

f1pvixpahnuxfjp73ojo43oax5asoopobyqu4hmea

Data Type of Application

Public, Open Commercial/Enterprise

Custom multisig

Identifier

No response

Share a brief history of your project and organization

In the NFT world today, many of the methods for obtaining on-chain information are complex and hard for ordinary users to work with, which makes the information difficult for the public to access.
Furthermore, data from the various blockchain projects is currently fragmented, making it even harder to find or organize. Web3Eye is a search engine that aggregates historical NFT transaction records and provides multi-chain aggregated search for NFT assets.

Is this project associated with other projects/ecosystem stakeholders?

No

If answered yes, what are the other projects/ecosystem stakeholders

No response

Describe the data being stored onto Filecoin

Many NFTs are stored on IPFS or on centralized servers, so when we view an existing NFT we sometimes get an error because the underlying data is gone. We think it is valuable to snapshot that data. In web3eye.io's engine we parse the NFT history, generate snapshots of the assets, and store them to Filecoin. When a user searches for the data and we know the original copy is missing, we recover it from Filecoin for the user and let the user know that the original source has been lost.
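
A minimal sketch of the snapshot flow described above, with hypothetical helper names (illustrative only, not web3eye.io's actual code):

```python
# Hypothetical sketch of the snapshot step: bundle locally cached NFT assets
# into a tarball and note which originals are already unreachable.
import tarfile
import urllib.request
from pathlib import Path

def original_is_reachable(url: str, timeout: int = 10) -> bool:
    """Return True if the NFT's original asset URL (IPFS gateway or HTTP) still resolves."""
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except Exception:
        return False

def build_snapshot(assets: dict[str, Path], out_tarball: Path) -> list[str]:
    """Bundle cached copies of assets into one snapshot tarball.

    `assets` maps the original URL to a locally cached file. Returns the URLs
    whose originals are already unreachable, so the indexer can mark them as
    recoverable only from the Filecoin backup.
    """
    missing = []
    with tarfile.open(out_tarball, "w:gz") as tar:
        for url, cached_file in assets.items():
            tar.add(cached_file, arcname=cached_file.name)
            if not original_is_reachable(url):
                missing.append(url)
    return missing
```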

Where was the data currently stored in this dataset sourced from

My Own Storage Infra

If you answered "Other" in the previous question, enter the details here

No response

If you are a data preparer. What is your location (Country/Region)

Hong Kong

If you are a data preparer, how will the data be prepared? Please include tooling used and technical details?

After our cross-chain meta engine generates the snapshots, the data is cut into deal-sized pieces with Singularity and the hard disks are shipped to the SPs.
We also provide an online method for fetching the data from our data source; once one copy has been stored successfully, the original snapshot is removed from the source.
We have also implemented a CAR-generation tool that produces car files and sends them to preset miners online.
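
A rough sketch of the batching step described above, under assumed parameters (the ~16 GiB target and helper names are illustrative; the actual Singularity invocation and deal size are not shown here):

```python
# Illustrative batching sketch only: group snapshot tarballs into roughly
# deal-sized batches before handing them to the CAR-generation step
# (Singularity or the in-house tool mentioned above, invoked elsewhere).
from pathlib import Path

TARGET_BATCH_BYTES = 16 * 2**30  # assumed ~16 GiB per batch; the real deal size may differ

def batch_tarballs(tarball_dir: Path) -> list[list[Path]]:
    """Greedy first-fit grouping of *.tar.gz snapshots into deal-sized batches."""
    batches: list[list[Path]] = []
    current: list[Path] = []
    current_size = 0
    for tarball in sorted(tarball_dir.glob("*.tar.gz")):
        size = tarball.stat().st_size
        if current and current_size + size > TARGET_BATCH_BYTES:
            batches.append(current)
            current, current_size = [], 0
        current.append(tarball)
        current_size += size
    if current:
        batches.append(current)
    return batches
```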

If you are not preparing the data, who will prepare the data? (Provide name and business)

No response

Has this dataset been stored on the Filecoin network before? If so, please explain and make the case why you would like to store this dataset again to the network. Provide details on preparation and/or SP distribution.

No

Please share a sample of the data

# Tarball with a bunch of NFT data
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdd65bf6-db76-11ee-8b8f-e7daf58958d9.tar.gz
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdd6b2f4-db76-11ee-932f-b7e26d86e3b0.tar.gz
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdd70a56-db76-11ee-b873-476f4f5f23c0.tar.gz
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdd76366-db76-11ee-8b14-93209f72bdaa.tar.gz
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdd7c8ce-db76-11ee-9968-b73681ea7bfb.tar.gz
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdd8401a-db76-11ee-94a6-bb3bcf1b70ec.tar.gz
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdd8c2f6-db76-11ee-a1aa-7f7632305de8.tar.gz
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdd93b6e-db76-11ee-97f6-dbd9d8c0e1f5.tar.gz
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdd99cd0-db76-11ee-95ff-ebf3bb4da26a.tar.gz
-rw-rw-r-- 1 coast coast 2148532224  3月  6 13:04 fdda004e-db76-11ee-8034-8b110be82bb3.tar.gz

# Car file generated with above tarball
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdd65bf6-db76-11ee-8b8f-e7daf58958d9.car
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdd6b2f4-db76-11ee-932f-b7e26d86e3b0.car
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdd70a56-db76-11ee-b873-476f4f5f23c0.car
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdd76366-db76-11ee-8b14-93209f72bdaa.car
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdd7c8ce-db76-11ee-9968-b73681ea7bfb.car
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdd8401a-db76-11ee-94a6-bb3bcf1b70ec.car
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdd8c2f6-db76-11ee-a1aa-7f7632305de8.car
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdd93b6e-db76-11ee-97f6-dbd9d8c0e1f5.car
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdd99cd0-db76-11ee-95ff-ebf3bb4da26a.car
-rw-rw-r-- 1 coast coast 113246208  3月  6 13:04 fdda004e-db76-11ee-8034-8b110be82bb3.car

# Original NFT data
-rw-rw-r-- 1 coast coast     72586  2月 24  2023 1000.webp
-rw-rw-r-- 1 coast coast    359239  9月 12 12:02 1109.png
-rw-rw-r-- 1 coast coast    375057  6月 18  2023 1_front.png
-rw-rw-r-- 1 coast coast   1417554  8月  7  2023 22_3.png
-rw-rw-r-- 1 coast coast     78969  8月 11  2023 4b560f2f-934c-43dc-9021-6f4d62fec19e
-rw-rw-r-- 1 coast coast     18252  6月 18  2023 613.png
-rw-rw-r-- 1 coast coast     22102  6月 18  2023 617.png
-rw-rw-r-- 1 coast coast   5081267  6月 18  2023 788.png
-rw-rw-r-- 1 coast coast    110899  2月 20  2023 8ce3abd5936a2ad5ffc03a27114eeb26.avif
-rw-rw-r-- 1 coast coast    803970  8月 21  2023 917590f0c295682acc5aa07f65d219db54b964d8.png
-rw-rw-r-- 1 coast coast   1413857  6月 18  2023 99.png
-rw-rw-r-- 1 coast coast    226886  8月 17  2023 body-7a-eyes-32b-clothes-80f-hairs-21e-earrings-21a.webp
-rw-rw-r-- 1 coast coast 209715200  3月  6 11:41 data.bin
-rw-rw-r-- 1 coast coast   1413857  6月 18  2023 eee.png
-rw-rw-r-- 1 coast coast     94281  6月 18  2023 he.jpeg
-rw-rw-r-- 1 coast coast  13306505  8月 17  2023 LiNHX_TiexGO0DRC4rUl3Fa9oH74Ny9vY_MYmbPWRoY.jpeg
-rw-rw-r-- 1 coast coast      7164  6月 18  2023 nn.jpeg
-rw-rw-r-- 1 coast coast    762051  9月 21 08:57 QmaMG7ygHJiXizKBwdddyKnPBBdRiYX5yac9uuQaBqB7mj.png
-rw-rw-r-- 1 coast coast   2706728  9月 15 16:37 QmXsX73NjzMKz62ukADqGGxAYcGdamrwUutWiFUp6JqR8Q.png

Confirm that this is a public dataset that can be retrieved by anyone on the Network

If you chose not to confirm, what was the reason

No response

What is the expected retrieval frequency for this data

Daily

For how long do you plan to keep this dataset stored on Filecoin

More than 3 years

In which geographies do you plan on making storage deals

Asia other than Greater China

How will you be distributing your data to storage providers

HTTP or FTP server, Shipping hard drives

How did you find your storage providers

Partners

If you answered "Others" in the previous question, what is the tool or platform you used

No response

Please list the provider IDs and location of the storage providers you will be working with.

| MinerID | Entity | City | Continent |
| --------- | ---------- | ------------ | --------- |
| f01970622 | Padeng | Hong Kong | Asia |
| TBD (our new miner) | web3eye.io | Hong Kong | Asia |
| f01146045 | Tianzhi | Singapore | Asia |
| f0832131 | Beishou | Yangzhou, CN | Asia |

How do you plan to make deals to your storage providers

Boost client

If you answered "Others/custom tool" in the previous question, enter the details here

No response

Can you confirm that you will follow the Fil+ guideline

Yes

kevzak commented 8 months ago

@kikakkz can you please complete a KYC check of your GitHub ID at this link? https://filplus.storage/

Log in with your GitHub ID and click the link to complete KYC.

datacap-bot[bot] commented 8 months ago

Application is waiting for governance review

datacap-bot[bot] commented 8 months ago

Datacap Request Trigger

Total DataCap requested

1PiB

Expected weekly DataCap usage rate

50TiB

Client address

f1pvixpahnuxfjp73ojo43oax5asoopobyqu4hmea

datacap-bot[bot] commented 8 months ago

DataCap Allocation requested

Multisig Notary address

Client address

f1pvixpahnuxfjp73ojo43oax5asoopobyqu4hmea

DataCap allocation requested

50TiB

Id

a40d9d68-71fd-4f4d-b05b-f61a68436893

datacap-bot[bot] commented 8 months ago

Application is ready to sign

kevzak commented 8 months ago

@kikakkz please complete this KYB form with details about the applicant and client: https://form.jotform.com/240786057753667

kikakkz commented 8 months ago

(screenshot attached)

kevzak commented 8 months ago

> @kikakkz can you please complete a KYC check of your GitHub ID at this link? https://filplus.storage/
>
> Log in with your GitHub ID and click the link to complete KYC.

I confirm KYC was completed: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/2328#issuecomment-2008408733

kikakkz commented 8 months ago

(screenshot attached)

As for the website, please wait 😄. We haven't formally launched yet; the testnet is temporarily offline.

kikakkz commented 8 months ago

> (screenshot attached)
>
> As for the website, please wait 😄. We haven't formally launched yet; the testnet is temporarily offline.

https://testnet.web3eye.io is online again, :)

kevzak commented 8 months ago

I confirm KYB was submitted and deemed legit

datacap-bot[bot] commented 8 months ago

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceawvoc3qhyzf7bgczh66xoflu5bd4ifk6vronokgw24b2kh45onke

Address

f1pvixpahnuxfjp73ojo43oax5asoopobyqu4hmea

Datacap Allocated

50TiB

Signer Address

f1v24knjbqv5p6qrmfjj5xmlaoddzqnon2oxkzkyq

Id

a40d9d68-71fd-4f4d-b05b-f61a68436893

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceawvoc3qhyzf7bgczh66xoflu5bd4ifk6vronokgw24b2kh45onke

datacap-bot[bot] commented 8 months ago

Application is Granted

datacap-bot[bot] commented 8 months ago

Application is in Refill

datacap-bot[bot] commented 8 months ago

DataCap Allocation requested

Multisig Notary address

Client address

f1pvixpahnuxfjp73ojo43oax5asoopobyqu4hmea

DataCap allocation requested

50TiB

Id

08507fa9-6633-4a19-9231-783a0d4584f5

datacap-bot[bot] commented 8 months ago

Application is ready to sign

kevzak commented 8 months ago

@kikakkz as part of the required data sample review, please provide FTP access for interactive exploration of the data. We cannot currently confirm what data you plan to store.

willscott commented 8 months ago

None of the listed SPs are currently providing indexes of their data to the network or making the data stored with them available for retrieval.

I do not have confidence that this data will be made available for retrieval, and I cannot recommend granting any subsequent DataCap until we can confirm that deals made with the initial DataCap are indeed retrievable.

kikakkz commented 8 months ago

> @kikakkz as part of the required data sample review, please provide FTP access for interactive exploration of the data. We cannot currently confirm what data you plan to store.

We don't currently have an FTP server for the data sample, since the data is delivered offline. We can prepare FTP access to our data.

kikakkz commented 8 months ago

> None of the listed SPs are currently providing indexes of their data to the network or making the data stored with them available for retrieval.
>
> I do not have confidence that this data will be made available for retrieval, and I cannot recommend granting any subsequent DataCap until we can confirm that deals made with the initial DataCap are indeed retrievable.

Thanks for the reply. We will work with all the SPs to make the data available for retrieval before creating deals.

kikakkz commented 7 months ago

Hey @kevzak, sorry for the late update. We managed to deploy a public frontend for our original data directory, 😄. You can access it at http://data.testnet.web3eye.io:21213/buckets/car/browse (sorry, we haven't set up HTTPS yet; once we do, you will be able to use http://data.testnet.web3eye.io/buckets/car/browse without the port). I'll email the access username and password to you privately.

As you can see in the following screenshot, we store the target car files in the car bucket and the original tarballs in the tar bucket. The original token images parsed from the blockchain are in the token-image bucket.

(screenshot attached)

Currently we only index data from the Ethereum and Solana networks. In the future we will support more blockchains.
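
For anyone who wants to script the exploration rather than browse, a rough sketch assuming the frontend is an S3-compatible object store (the /buckets/<name>/browse path resembles a MinIO console); the endpoint, port, and credentials below are placeholders, and the bucket names are taken from the comment above:

```python
# Sketch for reviewers, assuming the data frontend exposes an S3-compatible API.
# Endpoint and credentials are placeholders; the API port may differ from the
# console port shown in the browse URL.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://data.testnet.web3eye.io:21213",  # placeholder endpoint
    aws_access_key_id="ACCESS_KEY_FROM_EMAIL",
    aws_secret_access_key="SECRET_KEY_FROM_EMAIL",
)

# Buckets named in the comment above: car files, original tarballs, parsed token images.
for bucket in ("car", "tar", "token-image"):
    resp = s3.list_objects_v2(Bucket=bucket, MaxKeys=10)
    for obj in resp.get("Contents", []):
        print(bucket, obj["Key"], obj["Size"])
```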

kevzak commented 7 months ago

Ok, thanks for sharing @kikakkz. We will review again after you use the initial 50 TiB of DataCap, so we can see the deals and the data stored.

willscott commented 7 months ago

Your data preparation pipeline, to the extent that you have described it, will not be effective in making this content available.

Based on the shared files, the process is that you take the underlying resources of this dataset, run tar and gzip over them, and then wrap the resulting .tgz file in a car wrapper for storage. Because the individual assets are compressed, they will not be parsed or made available to the network, and it is unclear how storing deals in this format will lead to the content being preserved, discoverable, or available to IPFS or Filecoin users.

The tgz step also provides minimal compression of the image data (in the example provided, the source assets fit in the same deal size when converted directly into a car file as when the tgz compression step is applied first).
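
To make the addressability point concrete, here is a simplified illustration that uses SHA-256 digests as a stand-in for CIDs: packing assets individually keeps each one independently addressable, while the .tgz route yields a single opaque identifier for the whole archive.

```python
# Simplified illustration only, using SHA-256 digests as a stand-in for CIDs:
# per-file packing keeps each asset individually addressable; a tarball yields
# one opaque digest for the whole bundle.
import hashlib
import tarfile
from pathlib import Path

def digest(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def per_asset_identifiers(asset_dir: Path) -> dict[str, str]:
    """One identifier per asset: each image stays retrievable on its own."""
    return {p.name: digest(p) for p in sorted(asset_dir.iterdir()) if p.is_file()}

def archive_identifier(asset_dir: Path, out_tgz: Path) -> str:
    """One identifier for the whole .tgz: individual assets are no longer addressable."""
    with tarfile.open(out_tgz, "w:gz") as tar:
        tar.add(asset_dir, arcname=asset_dir.name)
    return digest(out_tgz)
```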

kikakkz commented 7 months ago

> Your data preparation pipeline, to the extent that you have described it, will not be effective in making this content available.
>
> Based on the shared files, the process is that you take the underlying resources of this dataset, run tar and gzip over them, and then wrap the resulting .tgz file in a car wrapper for storage. Because the individual assets are compressed, they will not be parsed or made available to the network, and it is unclear how storing deals in this format will lead to the content being preserved, discoverable, or available to IPFS or Filecoin users.
>
> The tgz step also provides minimal compression of the image data (in the example provided, the source assets fit in the same deal size when converted directly into a car file as when the tgz compression step is applied first).

These files are for our application, https://testnet.web3eye.io. They serve as backup storage for our cross-chain content indexer: users of our application will get the data from Filecoin storage if the original data is missing. You described the steps correctly, 😄; the files are generated by our cross-chain analysis engine.

(screenshot attached)

For example, if the original data is missing we mark it in our application, and the user can then retrieve the backup file from Filecoin storage (paid or free). Our application will let the user know that this is not the original data. Currently we only support images; in the future we will support other content such as videos, audio, and articles. So yes, it is not for general IPFS or Filecoin users; it is for our application's users, 😄.
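
A minimal sketch of that user-facing fallback, with hypothetical record fields (not web3eye.io's real API):

```python
# Hypothetical sketch of the fallback described above: serve the original URL
# if it is still live, otherwise serve the Filecoin backup and flag it so the
# user knows it is not the original source.
from dataclasses import dataclass
from typing import Optional

@dataclass
class AssetResult:
    url: str
    from_backup: bool  # surfaced to the user as "not the original source"

def resolve_asset(token_id: str, index: dict[str, dict]) -> Optional[AssetResult]:
    record = index.get(token_id)
    if record is None:
        return None
    if record.get("original_alive"):
        return AssetResult(url=record["original_url"], from_backup=False)
    # Original is marked missing: fall back to the snapshot stored on Filecoin.
    return AssetResult(url=record["filecoin_retrieval_url"], from_backup=True)
```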

kevzak commented 6 months ago

checker:manualTrigger

datacap-bot[bot] commented 6 months ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

kevzak commented 5 months ago

checker:manualTrigger

datacap-bot[bot] commented 5 months ago

DataCap and CID Checker Report[^1]

No active deals found for this client.

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

kevzak commented 3 months ago

Closing as inactive for 3 months.