Open slingerbv opened 2 years ago
The SearchSECO project wants to provide access to all software methods on GitHub by parsing each method and storing a hash for it. The hashes are stored in our Cassandra database. The SearchSECO Controller nodes (or just nodes, or miners) do the following:
So basically, the miners are mining methods all day. You can find the results on our portal.
The process breaks in step 4, when it tries to find the authors. What we need to know is:
If the error is reproducible: great! You can fix it.
If the error is not reproducible: bad! We don't know what's going on. But the second question is then still interesting: can we test for the wrong folder?
I know what the problem is, but it appears the Gitcoin issue has expired. Can you reopen it?
Sure, what do you think the problem is? I had already asked someone else to work on this Gitcoin bounty, I think.
This typically happens when the file's parent directory was renamed to the same spelling with different letter case.
For example, if the parent folder was originally src/Compilers/CSharp and was then changed to Src/Compilers/CSharp, it will cause this issue.
So what should we do with this issue? Close it? Will it happen again?
Slinger
-- dr. Slinger Jansen (Roijackers) Do you want to secure software ecosystems https://secureseco.org/ with us? Software Production Research Group https://www.uu.nl/en/research/software-systems/organization-and-information , Utrecht University http://www.slingerjansen.nl +31 6 19 884 880 book me through YouCanBook.me http://slingerroijackers.YouCanBook.me
You should be able to close it; it shouldn't happen again as long as no folder accidentally gets renamed. If it does happen again and a folder wasn't renamed, you can go to the .vs folder, delete the .suo file (it's hidden), and restart the computer; that should fix it. In rare cases it is caused by Visual Studio having something go wrong while closing a solution/project. Is there a bounty for this fix?
When a folder is renamed on Windows with the same spelling but different case (for example, renaming a folder myProject to MyProject), the operating system treats them as the same folder, but Visual Studio (specifically MSBuild) treats them as different folders.
Yes, that is correct.
You are very welcome. Do you want to reopen the issue? Alternatively, you could send ETH to me at this address: 0x09f939d15604899B2F5b7Ae832a57E583CbE0EfA
I'll start working on that now
What version of Visual Studio are you using?
After further inspection, the issue is slightly different. It's not actually MSBuild causing the issue but git blame. I will have an example shortly. The issue is still caused by directory renaming.
@abebeos, @slingerbv
This will reproduce the issue from the command line (NOTE: the issue is with git, not Visual Studio, MSBuild, or your code):
```
mkdir C:\FatalErrorDemo
cd C:\FatalErrorDemo
git init
mkdir Directory1
cd Directory1
copy NUL FileExample1.txt
git add FileExample1.txt
git commit -m "File1"
copy NUL FileExample2.txt
git add FileExample2.txt
git commit -m "File2"
cd ..

REM At this point, rename Directory1 to directory1 using Windows Explorer

REM Attempt a git blame with the new directory name; it won't work
git blame directory1/FileExample1.txt

REM Attempt a git blame with the old directory name; it will work
git blame Directory1/FileExample1.txt
```
Somehow, after https://github.com/dotnet/roslyn was downloaded, a folder name was changed, or there was an issue with the Roslyn repository.
If you look at the repository, it is https://github.com/dotnet/roslyn/tree/main/src and your application is looking for Src
A simple solution if this happens again with any repo you download and parse: redownload it. If that doesn't fix it, then the issue is with the repository you are trying to download and parse.
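As a possible way to test for the wrong folder before blaming, here is a minimal Python sketch. It assumes the failure mode from the repro above: the path passed to `git blame` differs only in letter case from the path git tracks. The function names (`tracked_spelling`, `list_tracked`, `blame`) and the file name `Parser.cs` are hypothetical and are not part of SearchSECOController; only `git ls-files -z` and `git blame -- <path>` are real git commands.

```python
import subprocess
from typing import Iterable, List, Optional


def tracked_spelling(requested: str, tracked: Iterable[str]) -> Optional[str]:
    """Return the spelling git tracks for `requested`, matching case-insensitively.

    An exact match wins; otherwise the first path that differs only by letter
    case is returned. None means the path is not tracked under any casing.
    """
    wanted = requested.replace("\\", "/")  # git reports paths with forward slashes
    fallback = None
    for path in tracked:
        if path == wanted:
            return path
        if fallback is None and path.casefold() == wanted.casefold():
            fallback = path
    return fallback


def list_tracked(repo_dir: str) -> List[str]:
    """List tracked paths via `git ls-files -z` (NUL-separated, unquoted)."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "ls-files", "-z"],
        check=True, capture_output=True,
    ).stdout
    return [p.decode("utf-8", "surrogateescape") for p in out.split(b"\0") if p]


def blame(repo_dir: str, requested: str) -> str:
    """Run `git blame` with the tracked spelling of `requested` (hypothetical wrapper)."""
    path = tracked_spelling(requested, list_tracked(repo_dir))
    if path is None:
        raise FileNotFoundError(f"{requested} is not tracked in {repo_dir}")
    result = subprocess.run(
        ["git", "-C", repo_dir, "blame", "--", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Hypothetical example: the caller asks for Src/..., git tracks src/...
tracked = ["src/Compilers/CSharp/Parser.cs", "README.md"]
print(tracked_spelling("Src/Compilers/CSharp/Parser.cs", tracked))
```

Whether blaming the tracked spelling succeeds depends on the filesystem; the repro above suggests it does on a case-insensitive Windows checkout, but that is not verified here for other platforms.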
That's the thing, there is nothing wrong with the SearchSECOController. In this specific case, the issue was with the repository located at https://github.com/dotnet/roslyn when the SearchSECOController downloaded it.
And I see nowhere in the SearchSECOController where the code renames it to Src; if it did, it would break on every repository. I do see that in the Roslyn repository they have done a massive amount of folder renaming and moving over the past year.
You can reproduce it in the SearchSECOController by using my example, committing it to a test repository, and then having the SearchSECOController try to download and parse the repository
I have 10+ years doing freelance/bounty type work. This is a very odd set of circumstances with this bug because it has nothing to do with your code/project
Again, the issue is with the repository you are pointing towards. It is their codebase that was broken at the time the error occurred. It is not an issue with SearchSECOController.
I'll try to come up with a way that maybe better explains it, and look into whether GitHub has put a fix in on their end to prevent this error from occurring. How often do these errors occur, and have they ever occurred with a repository other than Roslyn? Have a good day/night; I look forward to speaking again.
I have combed through hundreds, if not over a thousand, commits to Roslyn in the timeframe around when your issue occurred and have found only one other possibility: they have moved so many files and directories and renamed so many directories that SearchSECOController attempted to download at the wrong time, after a bad commit to Roslyn.
If the source code/repo you are trying to download and parse is broken, there isn't much you can do. Another small possibility is that when you did the download and parse, not all files fully downloaded due to the size of Roslyn, but I find that exceptionally unlikely, and setting up an example to prove it would take 30-40 hours.
This scenario is like you driving a car, someone else crashing into you, and you wanting to know what your car did wrong. Your car did nothing wrong; the other car was in the wrong. This scenario has ONLY ever been documented to occur with a bad directory change when committing, or when the folder name is changed locally (there isn't a single place in your code where this is possible).
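One way a downloaded repository could be "broken" in this sense is if it tracks the same directory under two casings at once (for example both `src/...` and `Src/...`) after a bad rename commit. That is an assumption about the failure, not something confirmed for Roslyn. A minimal Python sketch of a check for it, using a hypothetical helper `case_colliding_dirs` that takes a list of tracked paths (such as the output of `git ls-files`):

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def case_colliding_dirs(tracked: Iterable[str]) -> Dict[str, List[str]]:
    """Group directory prefixes that differ only by letter case.

    Keys are case-folded prefixes; values are the distinct spellings found.
    Only prefixes with more than one spelling are returned.
    """
    spellings = defaultdict(set)
    for path in tracked:
        parts = path.split("/")[:-1]  # drop the file name, keep directories
        for depth in range(1, len(parts) + 1):
            prefix = "/".join(parts[:depth])
            spellings[prefix.casefold()].add(prefix)
    return {key: sorted(names) for key, names in spellings.items() if len(names) > 1}


# Hypothetical tracked paths, for illustration only
collisions = case_colliding_dirs(
    ["src/Compilers/A.cs", "Src/Compilers/B.cs", "docs/readme.md"]
)
for folded, names in collisions.items():
    print(folded, names)
```

A non-empty result would be a reason to skip or flag the repository before the author lookup, rather than letting the miner fail on it.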
Hi there, sorry for my shitty replies. The thing is: I saw this happen twice. But your point is perfectly valid and I'm very happy with the explanation.
I hope you're doing well. Thanks for paying attention to our efforts: things have been moving along nicely, although the SearchSECO part of the project has been giving some headaches (especially managing Cassandra has been annoying).
In part there is some integral quality problem, so if you want to look at some tickets, here are the current urgent ones that give me the main headaches: https://gitcoin.co/issue/29394 https://gitcoin.co/issue/29422 https://gitcoin.co/issue/29421 https://gitcoin.co/issue/29420
I'm sure there are some others open, but these are the ones I expect will make my life much better in the short run.
Slinger
Thanks @slingerbv. I'll look at the tickets you mentioned tomorrow, and today I'll try to get to the new follow-up issue to this ticket that @abebeos created.
Did my explanation and methodology for what happened, and showing how it happened, satisfy the criteria for the bounty on this Gitcoin ticket (https://gitcoin.co/issue/29252)?
Issue Status: 1. Open 2. Started 3. Submitted 4. Done
This issue now has a funding of 99.9001 USD attached to it as part of the SecureSECO fund.
Yeah, you're right. I think I should open the bounty for RRA.
On Thu, Oct 6, 2022 at 3:57 PM abebeos @.***> wrote:
@RandomRandomnessAnon, can you please quickly post the CLI command for a SearchSECOController fetch of a repo? (I'd like to avoid looking into the docs now, as I'm busy with another task.)
As for the bounty, I believe that you should have it; a fix would be beyond the scope of this bounty (final decision: @slingerbv).
Work has been started.
These users each claimed they can complete the work by 1 month from now. Please review their action plans below:
1) randomrandomnessanon has been approved to start work.
Task will be completed by analyzing the errors, looking/debugging the codebase and providing a detailed explanation of the cause.
Learn more on the Gitcoin Issue Details page.
@abebeos I will once I get back to my computer. I'm currently on my phone.
Work for 99.9 USD (100.00 USD @ $1.0/USD) has been submitted by:
@slingerbv please take a look at the submitted work:
The funding of 99.9 USD (100.00 USD @ $1.0/USD) attached to this issue has been approved & issued to @randomrandomnessanon.
Ah, indeed, I didn't know the work needed to be closed on GC as well.
The funding of 100.14 USD (100.00 USD @ $1.0/USD) attached to this issue has been cancelled by the bounty submitter
Good one! Fixed by cancelling bounty.
On Wed, Oct 12, 2022 at 8:52 PM abebeos @.***> wrote:
Ah, indeed, I didn't know the work needed to be closed on GC as well.
Nah, most possibly a gitcoin problem. Another problem:
A duplicate of the issue, still open: https://gitcoin.co/issue/29252
Especially the number of methods uploaded in the final call is weird.