filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Epeng Times Technology Ltd Application 2nd batch] #357

Closed Epengtimes closed 2 years ago

Epengtimes commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Epeng Times Technology Ltd is a cryptocurrency market research company providing consolidated market insights and analyses drawn from on-chain data.

What is the primary source of funding for this project?

Investment from shareholders.

What other projects/ecosystem stakeholders is this project associated with?

We're not associated with other projects.

Use-case details

Describe the data being stored onto Filecoin

Consolidated cryptocurrency market data from various public information sources, such as DefiLlama, Etherscan, TradingView, DexGuru, etc.

Where was the data in this dataset sourced from?

Market information from web crawler tools.

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

Yes.
https://www.dropbox.com/s/f29ignxvy9jjwnm/sample_epeng.json?dl=0

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, it's public data that can be retrieved and used by anyone on the Filecoin network.

What is the expected retrieval frequency for this data?

It depends on users' demands.

For how long do you plan to keep this dataset stored on Filecoin?

We expect to store the data permanently.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

All regions. 

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Yes, we will transfer data both online and offline.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We plan to run tests first with different storage providers. We prefer storage providers who have a good track record in storing verified data.

How will you be distributing deals across storage providers?

We will distribute deals fairly among miners in all regions.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have enough resources to start making deals.

large-datacap-requests[bot] commented 2 years ago

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

large-datacap-requests[bot] commented 2 years ago

Thanks for your request! Everything looks good. :ok_hand:

A Governance Team member will review the information provided and contact you back pretty soon.

Sunnyiscoming commented 2 years ago

@Epengtimes Hi,

  1. Please provide the full organization name in the language of your country.
  2. Please provide the website and social media accounts of the organization or project. If you don't have a social media account, you can provide links to other related reports.
  3. Please provide more data samples to show that you already have 5 PB of data.

Sunnyiscoming commented 2 years ago

Any update here?

Epengtimes commented 2 years ago

@Sunnyiscoming, sorry for the late reply. Our name is EPENG TIMES TECHNOLOGY LTD, incorporated in the Cayman Islands. We have an office in Singapore. I can provide company registration material if needed. As we don't have a to-C business, there is no PR requirement, so there is no website or other link we can provide.

We don't have 1 PiB of raw data in hand (5 PiB of DataCap generally covers 1 PiB of raw data), since we don't have the capacity to store that much data. That is the reason for this DataCap application. Based on our business requirements, we expect to have PiB-scale raw data this year.

Sunnyiscoming commented 2 years ago

@Epengtimes I think Datacap is set up for an existing dataset. So I want to know how much data you have?

Epengtimes commented 2 years ago

> @Epengtimes I think Datacap is set up for an existing dataset. So I want to know how much data you have?

@Sunnyiscoming There is about 500 TiB of raw data that needs to be sealed into Filecoin now, and there will be another 500 TiB of new data in the next 3 months, as the growth rate is about 170 TiB/month. Thanks!
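As a rough check of the figures in this comment, the projection can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and a constant monthly growth rate is an assumption. The 500 TiB and 170 TiB/month values are the ones stated above.

```python
TIB_PER_PIB = 1024


def projected_raw_data_tib(current_tib, growth_tib_per_month, months):
    """Project the raw dataset size, assuming a constant monthly growth rate."""
    return current_tib + growth_tib_per_month * months


current = 500   # TiB of raw data ready to be sealed now (from the comment)
growth = 170    # TiB of new data per month (from the comment)
months = 3

new_data = growth * months
total = projected_raw_data_tib(current, growth, months)

print(f"new data in {months} months: {new_data} TiB")      # 510 TiB
print(f"projected total: {total} TiB")                     # 1010 TiB
print(f"projected total: {total / TIB_PER_PIB:.2f} PiB")   # 0.99 PiB
```

Under these assumptions, 3 months of growth adds 510 TiB, which matches the "another 500 TiB" estimate and brings the raw dataset to roughly 1 PiB.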

Sunnyiscoming commented 2 years ago

@Epengtimes Sorry for the slow update. You should adjust the total amount of DataCap being requested from 5 PiB to 500 TiB, because DataCap is set up for an existing dataset.

Sunnyiscoming commented 2 years ago

Otherwise, please provide proof that the organization exists, along with more data samples and detailed data sources, to prove that you already have 5 PB of data.

Epengtimes commented 2 years ago

> @Epengtimes Sorry for the slow update. You should adjust the total amount of DataCap being requested from 5 PiB to 500 TiB, because DataCap is set up for an existing dataset.

@Sunnyiscoming, thanks for the response!

  1. Based on LDN guidelines, clients will store up to 10 replicas of each piece of data to keep it safe, so 500 TiB of raw data needs at most 5 PiB of DataCap.
  2. Anyway, either 500 TiB or 5 PiB of DataCap is fine with us. The problem we are facing is that the LDN process is out of control. This application was created 2 months ago, and there have been only 2 rounds of communication in those 2 months. We really can't put resources into unpredictable things.
  3. As I mentioned before, we can provide company registration material. If needed, we can schedule a video conference to review those materials with you.
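The replica arithmetic in point 1 can be sketched as follows. This is a minimal illustration, assuming DataCap is consumed once per stored replica; the function name is hypothetical, and the 10-replica ceiling is the figure cited in the comment, not one verified here.

```python
TIB_PER_PIB = 1024


def datacap_needed_pib(raw_tib, replicas):
    """DataCap (in PiB) needed to store `replicas` copies of `raw_tib` TiB of raw data."""
    return raw_tib * replicas / TIB_PER_PIB


raw_tib = 500  # existing raw data, from the thread

single_copy = datacap_needed_pib(raw_tib, 1)    # one replica only
ten_copies = datacap_needed_pib(raw_tib, 10)    # ceiling cited in the comment

print(f"1 replica:   {single_copy:.2f} PiB")    # 0.49 PiB
print(f"10 replicas: {ten_copies:.2f} PiB")     # 4.88 PiB
```

With 10 replicas, 500 TiB of raw data comes to about 4.88 PiB, which is where the "5 PiB at most" figure comes from; a single copy needs just under 0.5 PiB.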
Sunnyiscoming commented 2 years ago

I will discuss that with the governance team. I hope we can solve your problem quickly.

dkkapur commented 2 years ago

@Epengtimes thanks - your application is still missing a client address. Will you be using the same one?

Separately, I'd like to understand your SP selection mechanism a bit better. Here are the stats from your deal distribution: https://filplus.d.interplanetary.one/clients/f01624861/breakdown (please let us know if this is not right).

From what I see in your dealmaking, a majority of your data is stored with these 4 SP IDs:

  1. f01602479
  2. f01606675
  3. f01606849
  4. f01641612

For these 4 SPs, some things I noticed:

This leads me to believe that they are likely owned by the same entity. Is that not accurate? How did you find these SP IDs?

I'd like to see some details on your plan for this. Which SPs do you plan to work with now that you've already made several deals on chain and want to store 10x replicas for your data?

> We will distribute deals fairly among miners in all regions.

raghavrmadya commented 2 years ago

@Epengtimes , waiting for a response. If we don't hear back by Friday, we might have to close the issue.

Epengtimes commented 2 years ago

Hi @dkkapur @raghavrmadya, the 4 major SPs are operated by the same technical team. I am not sure whether they are owned by the same business entity; they were introduced to us by our BD team.

We are trying to find more distributed SPs to store our data. It's difficult to find a suitable partner for this, because cooperating takes a lot of work and also requires good network conditions between us. Our team is very busy with production business, so the process is slow.

For backup and archive purposes, we think the 4 major SPs are safe enough, and this has been proven over the past year.

For this batch, we'd like to start with these 4 SPs and continue looking for more. If this does not conform to the rules, we could close this application temporarily and re-open it when we have good SP partners.

Thanks!

raghavrmadya commented 2 years ago

Storing all copies in potentially the same location/data center defeats the point of decentralized storage, even if you are dealing with 4 different storage providers. We would recommend either providing proof that the copies are not being stored in the same location or finding other SPs that are more diverse.
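One way to make this concern concrete is to group stored data by operator rather than by SP ID. The sketch below is illustrative only: the function name, the operator label, and the equal per-SP deal sizes are hypothetical placeholders, not figures from the on-chain breakdown. The only input taken from the thread is that the four SP IDs are run by the same technical team.

```python
from collections import Counter


def operator_share(deals, operator_of):
    """Return the fraction of stored data held by each operator.

    deals: iterable of (sp_id, size) pairs.
    operator_of: mapping from SP ID to operator; unmapped SPs count as their own operator.
    """
    totals = Counter()
    for sp_id, size in deals:
        totals[operator_of.get(sp_id, sp_id)] += size
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {op: size / grand_total for op, size in totals.items()}


# The four SP IDs named in the thread, all run by one technical team (per the client).
operator_of = {sp: "same-team" for sp in ("f01602479", "f01606675", "f01606849", "f01641612")}

# Hypothetical deal sizes (arbitrary equal units), for illustration only.
deals = [(sp, 1) for sp in operator_of]

shares = operator_share(deals, operator_of)
print(shares)
```

Counted by SP ID the data looks spread over four providers, but counted by operator all of it sits with a single party, which is the situation described above.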

raghavrmadya commented 2 years ago

Please let us know how you would like to proceed and we can either close this app or wait for you.

raghavrmadya commented 2 years ago

@Epengtimes , closing this application as per your suggestion. Please re-open with good SP partners. You can always reach out to SPs on Filecoin Slack, BigDexchange, and Filgram. Please make sure to link this application when you open a new one to move forward quickly.

Thanks