filecoin-project / filecoin-plus-large-datasets

Hub for client applications for DataCap at a large scale

[DataCap Application] Samdata Trade #382

Closed Samdatatrade1 closed 1 year ago

Samdatatrade1 commented 2 years ago

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Project details

Share a brief history of your project and organization.

Matrix is built around quantitative investment trading. It provides a quantitative investment management system with asset portfolio management and other functions, offering a full-stack quantitative trading solution for professional quantitative institutions and quants.

What is the primary source of funding for this project?

Company's revenues and venture investors.

What other projects/ecosystem stakeholders is this project associated with?

Nothing else.

Use-case details

Describe the data being stored onto Filecoin

1.  Data type: MongoDB
2.  The data stored on Filecoin is mainly quotation data, such as market reference data, options data, index data, fund data, futures data, and bond data.

Where was the data in this dataset sourced from?

The data is crawled from the world's major centralized cryptocurrency exchanges (CEXs).

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

[AMPUSDT_daily.csv](https://github.com/filecoin-project/filecoin-plus-large-datasets/files/8851303/AMPUSDT_daily.csv)
[BORAUSDT_daily.csv](https://github.com/filecoin-project/filecoin-plus-large-datasets/files/8851305/BORAUSDT_daily.csv)

[CELOUSDT_daily.csv](https://github.com/filecoin-project/filecoin-plus-large-datasets/files/8851308/CELOUSDT_daily.csv)

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Yes, we confirm that the data is public and can be retrieved by anyone on the network.

What is the expected retrieval frequency for this data?

Once a year.

For how long do you plan to keep this dataset stored on Filecoin?

At least 3 years.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

We plan to make deals across Asia.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Both online and offline.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We will negotiate with the miners to enable fast retrieval.

How will you be distributing deals across storage providers?

Across at least 5 SPs.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

Yes, we have enough funding. We would like good technical support and good SPs.

kevzak commented 1 year ago

checker:manualTrigger

filplus-checker-app[bot] commented 1 year ago

DataCap and CID Checker Report Summary[^1]

Retrieval Statistics

⚠️ All retrieval success ratios are below 1%.

Storage Provider Distribution

⚠️ 13 storage providers sealed too much duplicate data - f01889046: 42.80%, f01914977: 46.91%, f01887652: 89.83%, f01926914: 77.55%, f01926585: 79.29%, f01917214: 25.81%, f01917241: 25.47%, f01917257: 25.83%, f01889480: 83.52%, f01924350: 71.71%, f01937964: 81.14%, f01895990: 32.38%, f01895778: 33.05%

⚠️ 1 storage provider has an unknown IP location - f02207907

Deal Data Replication

⚠️ 100.00% of deals are for data replicated across fewer than 4 storage providers.

Deal Data Shared with other Clients[^3]

⚠️ CID sharing has been observed. (Top 3)

[^1]: To manually trigger this report, add a comment with text checker:manualTrigger

[^2]: Deals from those addresses are combined into this report as they are specified with checker:manualTrigger

[^3]: To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...

Full report

Click here to view the CID Checker report. Click here to view the Retrieval report.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

github-actions[bot] commented 1 year ago

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!

Sunnyiscoming commented 10 months ago

Hello @Samdatatrade1, per https://github.com/filecoin-project/notary-governance/issues/922 for Open, Public Dataset applicants, please complete the following Fil+ registration form to identify yourself as the applicant, and please also add the contact information of the SP entities you are working with to store copies of the data.

This information will be reviewed by the Fil+ Governance team to confirm validity, and then the application will be allowed to move forward for additional notary review.

aggregation-and-compliance-bot[bot] commented 9 months ago

Client f01853599 does not follow the DataCap usage rules. More info here. This application has been failing the requirements for 7 days. Please take appropriate action to fix the following DataCap usage problems.

| Criteria | Threshold | Reason |
| --- | --- | --- |
| CID checker score | > 25% | The client has a CID checker score of 14%. This should be greater than 25%. To find out more about the CID checker score, please look at this issue: https://github.com/filecoin-project/notary-governance/issues/986 |
| Shared data percent | < 20% | 23.78% of the client's data is shared with other clients. This should be less than 20%. |
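
For anyone who wants to re-check these two criteria locally, here is a minimal sketch of the threshold comparison. It is not the bot's actual implementation: the `Criterion` class and the `failing_criteria` helper are hypothetical names introduced for illustration, and the only values taken from this thread are the two thresholds (> 25% and < 20%) and the observed figures (14% and 23.78%). It assumes both comparisons are strict, as the wording "greater than" and "less than" suggests.

```python
from dataclasses import dataclass
import operator


@dataclass(frozen=True)
class Criterion:
    """One DataCap usage rule (hypothetical structure, not the bot's own)."""
    name: str
    comparator: str   # ">" means the value must exceed the threshold, "<" the opposite
    threshold: float  # expressed in percent

    def passes(self, value: float) -> bool:
        # Strict comparison, matching "should be greater/less than" in the report.
        op = operator.gt if self.comparator == ">" else operator.lt
        return op(value, self.threshold)


# Thresholds as quoted in the compliance report above.
CRITERIA = [
    Criterion("CID checker score", ">", 25.0),
    Criterion("Shared data percent", "<", 20.0),
]


def failing_criteria(observed: dict[str, float]) -> list[str]:
    """Return the names of the criteria that the observed values do not meet."""
    return [c.name for c in CRITERIA if not c.passes(observed[c.name])]


# Values reported for client f01853599 in this thread.
observed = {"CID checker score": 14.0, "Shared data percent": 23.78}
print(failing_criteria(observed))
# → ['CID checker score', 'Shared data percent']
```

With the reported figures both criteria fail, which matches the two rows listed by the bot.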