filecoin-project / notary-governance

Removal of Notaries from the Fil+ Program & DC from RongYin Project #900

Closed Chris00618 closed 1 year ago

Chris00618 commented 1 year ago

Issue Description

RongYin Open Data Project involves:

The following notaries colluded with the client, disregarded DC report and retrieval data:

In application #2050:

Proposed Solution(s)

Removal of the notaries from the Fil+ Program, and removal of the DC allocated to the RongYin Open Data Project

Timeline

Discussion at June 27th 2023 WG Call

Community review until June 30th 2023 at 5 pm ET

Decision announced by the T&T WG Lead after

Technical dependencies

Removal from main multisig

Related Issues

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1579

https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1580

kernelogic commented 1 year ago

I signed these applications in February and they were the first round, well before the retrieval bots went online, or any retrievals possible for that matter.

I think you should be careful using the "collude" word, it is a personal attack without concrete evidence.

herrehesse commented 1 year ago

@kernelogic and @Chris00618, I agree with both of you. As a notary who signed before retrievals were possible, I can attest that the landscape has changed significantly since then. However, it is important not to use that as an excuse to hide from the current situation.

Now, the question for @kernelogic is: If you thoroughly investigate this client/application at present, what are your thoughts on the matter?

kernelogic commented 1 year ago

The CID sharing definitely needs some explanation.

Chris00618 commented 1 year ago

I signed these applications in February and they were the first round, well before the retrieval bots went online, or any retrievals possible for that matter.

@kernelogic You also signed on April 20. You commented "DD performed perviously. CID checker result looks good." (https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1580#issuecomment-1515667961) in response to this report: https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1580#issuecomment-1514285197

Why did you think this was enough to get your signature and now come to ask

The CID sharing definitely needs some explanation.

The evidence is clear: you failed your notary duty, and I want to know why. Showing your due diligence is the right way forward; accusing me won't help.

I think you should be careful using the "collude" word, it is a personal attack without concrete evidence.

Chris00618 commented 1 year ago

@mikezli @sxxfuture-official @NiwanDao @1ane-1 @fireflyHZ @luobin544 @AlanGreaterheat

Carohere commented 1 year ago

Valid points. I kindly request all notaries involved to engage in this discussion for community transparency 🙏

kernelogic commented 1 year ago

Every notary can have their own DD process and standard. In my "career" as a notary, if there is some minimal CID sharing, I allow it. It is usually caused by technical accidents on the client side when making CAR files.

In this particular case, there are 3 CID sharings. The first one 6.03 TB is large enough to raise a concern however it belongs to the same series, therefore it is completely normal. The other two 864GB and 672GB are small enough (to my eyes) to allow.

I don't think there's any zero-tolerance policy on CID sharing. If it's minimal, accidental, or well explained, I usually allow it. If you don't like my style, you can request the removal of every notary who ever signed an LDN with CID sharing and see what the community says.

datalove2 commented 1 year ago

Hi, I would like to provide some responses regarding our LDN application:

Regarding the issue with data retrieval, this is due to a mismatch in the retrieval commands between our collaborating SP and the checker. I believe many encapsulated codes might encounter similar issues if they do not utilize boost to submit requests. We have already optimized our program and conducted tests to ensure successful data retrieval by the check bot.

Concerning cid sharing, we have addressed this matter when communicating with notaries. As kernelogic mentioned, minimal data sharing occurs unintentionally during data processing because the collaborating sp may receive deals from multiple clients.

Furthermore, from my perspective as a client, it is unreasonable to request the removal of notaries based on technical issues (which have already been optimized). They are active members of the community, and it is their activity that allows me to contact them on Slack. Many of the notaries actually ask me to provide the CID for their review process. Isn't that an acceptable approach? Additionally, since the retrieval bot was released only a few weeks ago, no clear rules or instructions had been given to clients. Even so, we are actively cooperating with the latest updates.

Kind reminder: please close this proposal to avoid bothering others.

herrehesse commented 1 year ago

Requesting the closure of an open issue based on the premise of "bothering others" is not a valid reason. Additionally, your application claims to store merged datasets, which is not acceptable. I recommend that @raghavrmadya promptly close this application, as evidenced in the following link: https://github.com/filecoin-project/notary-governance/issues/832

When it comes to open public datasets, they should be stored individually as separate LDNs and must be fully retrievable for the public.

Based on the information provided, your application exhibits the following issues:

datalove2 commented 1 year ago

Most of the deals are being synchronized on the chain, which is not what you said.

YuanHeHK commented 1 year ago

Hi @Chris00618 , You say that these notaries "collude" with the client, I do not agree.

First, you talk about the problem of sharing data with multiple LDNs. When I signed, this LDN did have a small amount of CID overlap with other LDNs. An 864G and a 672G are not serious compared to a few P's. In my perception as a notary, I think it's acceptable.

Second, the 0 retrieval rate problem you mentioned is currently from the search report of the feedback of the new retrieval robot, almost all LDNs have this problem. I think this is a technical problem, and the client can be adjusted to rectify and adjust. And I also tried to select a miner to retrieve through the original command when signing, at least at that time, it met the search requirements.

Third, when you say 'High deal data replication: 82.82%,' I don't know what you're trying to say, so please express it clearly before I can explain it to you.

Finally, the biggest goal of fil+ as I understand it is to attract real and effective data to achieve filecoin on-chain storage, followed by the storage specifications that comply with fil+. The dataset of #1580 is itself an open source dataset, and its value and significance are beyond doubt. It may be flawed in the FIL+ specification, but it is not fatal. I think as long as the client can give a reasonable explanation.

Chris00618 commented 1 year ago

Before going through these notaries' signing behavior, let's remember that this is an Agriculture dataset application sharing CIDs with multiple different open dataset applications and a FIL-E application.

In https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1579,

@flyworker @luobin544 signed after image

@1ane-1 signed after image

@NiwanDao signed after image

@sxxfuture-official @mikezli signed after image CID sharing with a FIL-E application https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1155

Chris00618 commented 1 year ago

In https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/1580,

@sxxfuture-official @zcfil @mikezli signed after image

Chris00618 commented 1 year ago

In https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/2050,

@AlanGreaterheat signed without checking the client's past allocation history, no questions asked, his DD proof image

@1ane-1 signed many times in other two applications, he is aware of the client's past allocation history but still signed anyway, his DD proof image

cryptowhizzard commented 1 year ago

@Chris00618 Thank you for providing these clear statistics which we can use as evidence. Love to hear the opinion from T&T @raghavrmadya.

YuanHeHK commented 1 year ago

In filecoin-project/filecoin-plus-large-datasets#1580,

@sxxfuture-official @zcfil @mikezli signed after image

I've only signed #1580 and the state I signed is as I explained earlier. At least at the time I thought it was in line with the signature standards I understood.

Screenshot 2023-07-04 21-24-03

Screenshot 2023-07-04 21-24-38

Chris00618 commented 1 year ago

Hi @Chris00618 , You say that these notaries "collude" with the client, I do not agree.

First, you talk about the problem of sharing data with multiple LDNs. When I signed, this LDN did have a small amount of CID overlap with other LDNs. An 864G and a 672G are not serious compared to a few P's. In my perception as a notary, I think it's acceptable.

It has CID sharing with 5 NDLABS applications. Doesn't this seem strange to you? Have you checked the report? image

Second, the 0 retrieval rate problem you mentioned is currently from the search report of the feedback of the new retrieval robot, almost all LDNs have this problem. I think this is a technical problem, and the client can be adjusted to rectify and adjust. And I also tried to select a miner to retrieve through the original command when signing, at least at that time, it met the search requirements.

I'm ok with this since you signed before the retrieval report launch.

Third, when you say 'High deal data replication: 82.82%,' I don't know what you're trying to say, so please express it clearly before I can explain it to you.

Finally, the biggest goal of fil+ as I understand it is to attract real and effective data to achieve filecoin on-chain storage, followed by the storage specifications that comply with fil+. The dataset of #1580 is itself an open source dataset, and its value and significance are beyond doubt. It may be flawed in the FIL+ specification, but it is not fatal. I think as long as the client can give a reasonable explanation.

I don't know which part you don't understand. image

Please see for yourself https://github.com/data-preservation-programs/filplus-checker-assets/blob/main/filecoin-project/filecoin-plus-large-datasets/issues/1580/1681890322589.md

Chris00618 commented 1 year ago

The most recent report of #1579 https://github.com/data-preservation-programs/filplus-checker-assets/blob/main/filecoin-project/filecoin-plus-large-datasets/issues/1579/1686303180588.md has 9.44T+2.38T+1.84T+1.75T+800G+576G+224G+32G CID sharing with NDLABS and 7.91T CID sharing with a FIL-E application Wel Vape.
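The overlap sizes quoted above can be tallied with a short script. This is a minimal sketch, assuming "T" and "G" in the comment mean TiB and GiB (the comment does not say which unit system is used); the figures are copied from the comment rather than recomputed from the report, and the names `NDLABS_SHARED`, `FIL_E_SHARED`, `to_gib`, and `total_tib` are purely illustrative.

```python
# Overlap sizes as quoted in the comment on the #1579 report.
# Assumption: "T" means TiB and "G" means GiB (binary units).
NDLABS_SHARED = ["9.44T", "2.38T", "1.84T", "1.75T", "800G", "576G", "224G", "32G"]
FIL_E_SHARED = ["7.91T"]

# Number of GiB in one unit of each suffix.
GIB_PER_UNIT = {"G": 1, "T": 1024}


def to_gib(size: str) -> float:
    """Convert a string such as '9.44T' or '800G' to GiB."""
    return float(size[:-1]) * GIB_PER_UNIT[size[-1]]


def total_tib(sizes: list[str]) -> float:
    """Sum a list of size strings and return the total in TiB."""
    return sum(to_gib(s) for s in sizes) / 1024


ndlabs_tib = total_tib(NDLABS_SHARED)
fil_e_tib = total_tib(FIL_E_SHARED)

print(f"Shared with NDLABS:           {ndlabs_tib:.2f} TiB")
print(f"Shared with FIL-E (Wel Vape): {fil_e_tib:.2f} TiB")
print(f"Combined:                     {ndlabs_tib + fil_e_tib:.2f} TiB")
```

Under that unit assumption, the eight NDLABS entries add up to roughly 17 TiB, and adding the FIL-E overlap brings the combined figure to roughly 24.9 TiB.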

The most recent report of #1580 https://github.com/data-preservation-programs/filplus-checker-assets/blob/main/filecoin-project/filecoin-plus-large-datasets/issues/1580/1686303190748.md has 6.56TiB+992GiB CID sharing with two Hepta applications #1693 and #1731. #1693 has more than 10TiB CID sharing and #1731 has 7TiB CID sharing with NDLABS application.

https://github.com/data-preservation-programs/filplus-checker-assets/blob/main/filecoin-project/filecoin-plus-large-datasets/issues/1693/1685673968369.md

https://github.com/data-preservation-programs/filplus-checker-assets/blob/main/filecoin-project/filecoin-plus-large-datasets/issues/1731/1688389949946.md

The reports show clear violations: the notaries didn't do due diligence, and NDLABS is involved in all of the applications. One instance of CID sharing is a coincidence; ten is not. The notaries who signed these applications should be removed, along with the DC granted.

NDLABS-Leo commented 1 year ago

@Chris00618 Regarding the data sharing with ND LABS, we have provided explanations in our application and have addressed this issue multiple times. It becomes repetitive to reiterate the same historical issues whenever a question arises. If necessary, we will explain it during the TT meeting, but we will refrain from providing further responses here.

Chris00618 commented 1 year ago

https://filecoinproject.slack.com/archives/C0405HANNBT/p1688533770165699?thread_ts=1688370299.699409&cid=C0405HANNBT

raghavrmadya commented 1 year ago

Thanks. It is most reasonable that DC be removed from clients these notaries have supported. Based on this discussion, the T&T WG will be recommending increased scrutiny on the aforementioned notaries and suggests that this issue be taken into account in the assessment of their notary applications in the next round.

cc @Kevin-FF-USA . Please take a note of the notaries list provided by @Chris00618

kernelogic commented 1 year ago

It is most reasonable that DC be removed from clients these notaries have supported.

@raghavrmadya Do you mean this RongYin project only, or is every other LDN we ever signed going to get removed as well?

kernelogic commented 1 year ago

It seems all the previous explanations were ignored, and we will face increased scrutiny instead.