NebulousLabs / Sia

Blockchain-based marketplace for file storage. Project has moved to GitLab: https://gitlab.com/NebulousLabs/Sia
https://sia.tech
MIT License
2.71k stars 442 forks source link

Unable to upload files with renter #3076

Open memory-thrasher opened 6 years ago

memory-thrasher commented 6 years ago

BUG REPORT

Stack Trace or error message $ curl -A 'Sia-Agent' http://127.0.0.1:9980/renter/files {"files":[{"siapath":"bar3.txt","localpath":"C:\test","filesize":10,"available":false,"renewing":true,"redundancy":0,"uploadedbytes":0,"uploadprogress":0,"expiration":0} ... (a few other attempts with the same issue)

Expected Behavior File should upload, and be reported as uploaded and available by retner/files api call.

How to reproduce it (as minimally and precisely as possible) (I use cygwin here for curl, so please excuse the backslash poisoning.) echo "something" >C:\\test curl -A 'Sia-Agent' http://127.0.0.1:9980/renter/upload/bar3.txt?source=C:\\test -X POST sleep 24h curl -A 'Sia-Agent' http://127.0.0.1:9980/renter/files observe uploadprogress and uploadedbytes are 0

Environment

Notes First noticed on released binaries from github, but reproduced with current upstream (v1.3.3 Git Revision f9fe3ec), cross-compiled from ubuntu 16.04.1 VM. I picked a few contracts at random and they all seem to be running v1.3.2 as well, according to siahub. Similar to issue #2222 but the fix given there did not help me. I know siad can read (or at least stat) the file because it gets the size correct (10 bytes) in /files api call.

Daemon reports 58 contracts with 54 distinct hosts: $ curl -sA 'Sia-Agent' http://127.0.0.1:9980/renter/contracts | grep -o '"netaddress":"[^"]"' | cut -d\" -f4 | sort -u | wc -l 54 $ curl -sA 'Sia-Agent' http://127.0.0.1:9980/renter/contracts | grep -o '"netaddress":"[^"]"' | cut -d\" -f4 | sort | wc -l 58

Consensus: (siastats.info agrees with the current block height) $ curl -A 'Sia-Agent' http://127.0.0.1:9980/consensus; date -u {"synced":true,"height":157313,"currentblock":"000000000000000067ec819b02458bc43b80793f9e508281e944764bdceda1c3","target":[0,0,0,0,0,0,0,0,116,71,54,49,6,139,12,223,51,201,199,253,254,248,193,148,26,250,49,32,158,115,143,190],"difficulty":"40612665787593496686"} Sun, Jun 03, 2018 2:56:18 AM

Renter settings: $ curl -A 'Sia-Agent' http://127.0.0.1:9980/renter {"settings":{"allowance":{"funds":"800000000000000000000000000","hosts":50,"period":25920,"renewwindow":1008},"maxuploadspeed":0,"maxdownloadspeed":0,"streamcachesize":2},"financialmetrics":{"contractfees":"108565264558212966848060000","downloadspending":"0","storagespending":"0","totalallocated":"316709631111179363623719979","uploadspending":"0","unspent":"691434735441787033151940000","contractspending":"295376297777846030290386647"},"currentperiod":157152}

11 distinct peers: $ curl -A 'Sia-Agent' http://127.0.0.1:9980/gateway {"netaddress":"68.231.191.159:9981","peers":[{"inbound":false,"local":false,"netaddress":"142.161.29.154:9981","version":"1.3.2"},{"inbound":false,"local":false,"netaddress":"144.217.7.188:9981","version":"1.3.1"},{"inbound":false,"local":false,"netaddress":"141.2.114.38:9981","version":"1.3.2"},{"inbound":false,"local":false,"netaddress":"97.90.42.96:9955","version":"1.3.3"},{"inbound":false,"local":false,"netaddress":"217.71.230.201:9981","version":"1.3.2"},{"inbound":true,"local":false,"netaddress":"110.82.111.87:9981","version":"1.3.2"},{"inbound":false,"local":false,"netaddress":"24.4.189.106:9981","version":"1.3.1"},{"inbound":false,"local":false,"netaddress":"163.172.215.185:9981","version":"1.3.2"},{"inbound":true,"local":false,"netaddress":"73.78.208.37:9981","version":"1.3.2"},{"inbound":false,"local":false,"netaddress":"68.195.90.26:9981","version":"1.3.1"},{"inbound":false,"local":false,"netaddress":"79.188.159.193:9981","version":"1.3.2"}]}

Renter.log: 2018/06/03 01:54:50.102326 uploadheap.go:302: Repairing 4 chunks This line repeats many times with different timestamps, with nothing else of note in the log.

[Edited for a few typos]

MSevey commented 6 years ago

@memory-thrasher if you run siac renter uploads does it show the file as uploading still?

Also have you checked your wallet's confirmed siacoin balance is no zero?

memory-thrasher commented 6 years ago

@MSevey Yes and yes. The other files listed are different attempts of the same file with different combinations of syntax and file source location.

siac renter uploads Uploading 6 files: 10 B wafflecattest/bar2.txt (uploading, 0.00%) 10 B bar3.txt (uploading, 0.00%) 10 B bar.txt (uploading, 0.00%) 10 B bar.txt (uploading, 0.00%) 10 B bar2.txt (uploading, 0.00%) 10 B wafflecattest/bar.txt (uploading, 0.00%)

siac wallet Wallet status: Encrypted, Unlocked Height: 157578 Confirmed Balance: 6.524 KS Unconfirmed Delta: +0 H Exact: 6523742079166468725388328739 H Siafunds: 0 SF Siafund Claims: 0 H

Estimated Fee: 30 mS / KB

MSevey commented 6 years ago

@memory-thrasher are there any error messages in contractor.log or hostdb.log

memory-thrasher commented 6 years ago

@MSevey hostdb.log has only lines very similar to these three:

2018/06/05 14:47:52.776943 scan.go:449: Performing scan on 785 online hosts and 244 offline hosts. 2018/06/03 02:32:35.819536 hostdb.go:103: SHUTDOWN: Logging has terminated. 2018/06/03 02:33:46.567520 hostdb.go:97: STARTUP: Logging has started. Siad Version 1.3.3

contractor.log has various errors from previous attempts, none appear relevant to the current state, but I am listing the most recent of each (note the timestamps):

2018/06/02 13:57:05.260786 contracts.go:580: Attempted to form a contract with 82.220.99.82:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout

[I fixed this one by increasing renter allowance (from ~4sc to ~800sc)]

2018/06/01 21:30:21.325341 allowance.go:63: INFO: setting allowance to {800000000000000000000000000 50 25920 1008} 2018/06/01 22:01:41.874140 contracts.go:479: WARN: failed to renew contract 6257d0965db9fcde52..[redacted]: host did not accept our proposed contract: internal error: could not add collateral: wallet has coins spent in incomplete transactions - not enough remaining coins 2018/06/01 23:00:55.303369 contracts.go:479: WARN: failed to renew contract 6257d0965db9fcde525c...[redacted]: host did not accept revision request: EOF 2018/06/02 13:57:06.312420 contracts.go:178: Formed contract with nitsllc.ddns.net:9982 for 1.382 SC 2018/06/04 16:31:38.714885 contracts.go:538: WARN: failed to renew 6257d0965db9fcde525cd402174db58d20a2f915903db5916da50e8af7b0a31d, marked as bad: [wallet must be unlocked before it can be used]

[I unlocked the wallet]

2018/06/04 16:53:44.567247 contracts.go:454: renewing 4 contracts 2018/06/04 16:53:56.006699 contracts.go:550: Renewed contract 8dc5ee88c1192e635f...[redacted]

The "failed to renew...marked as bad" has me worried. They started immediately after (re)starting the daemon, so is it possible that the contracts are being marked as bad before I have a chance to unlock the wallet? If so, shouldn't they be cancelled and replaced with new contracts, or else reattempted after the wallet is unlocked?

Note that these logs are the ones generated from my cross-compile of the master branch with ''make release'', as I am led to believe they are more verbose.

ChrisSchinnerl commented 6 years ago

@memory-thrasher there is a chance that your contracts are marked as bad due to your wallet not being unlocked during the renew window. That's actually something we are actively working on right now in #3080.

memory-thrasher commented 6 years ago

Alright, I'm subscribed to that issue now and I'll give it a shot once it's merged to master.

I have a few concerns though: The issue is labeled "Only replace unrenewable contract if it's the host's fault", but in this case you are saying that the contract becoming unrenewable is the renter's fault (for not unlocking the wallet); so what happens to contracts when the renter is at fault?

Secondly, since it appears I am at fault, it would be nice to have some kind of error message telling me I have to unlock the wallet, perhaps in the ''siac renter uploads'' cmd, or at least a note in the API docs.

Thirdly, what do I do now and what happens to the sc I spent on these "bad" contracts? In my understanding the merge you mentioned will at most keep future contracts from being marked as bad in this way, but how does that fix the contracts that are already marked as bad? Is there a way to abandon the existing contracts so they can be replaced, and if so how and what repercussions would that incur?

Finally, if having a locked wallet immediately upon starting the daemon voids existing contracts, and if that is the renter's fault, does sia not support renting with an encrypted private key, or is restarting the daemon to apply updates seen as unreasonable?

memory-thrasher commented 6 years ago

@ChrisSchinnerl @MSevey Still stuck at 0.

F:\swl\Sia-v1.3.2-windows-amd64>siac wallet Wallet status: Encrypted, Unlocked Height: 157709 Confirmed Balance: 6.524 KS Unconfirmed Delta: +0 H Exact: 6523742079166468725388328739 H Siafunds: 0 SF Siafund Claims: 0 H Estimated Fee: 30 mS / KB F:\swl\Sia-v1.3.2-windows-amd64>siac renter uploads Uploading 6 files: 10 B wafflecattest/bar2.txt (uploading, 0.00%) 10 B bar.txt (uploading, 0.00%) 10 B bar.txt (uploading, 0.00%) 10 B bar2.txt (uploading, 0.00%) 10 B bar3.txt (uploading, 0.00%) 10 B wafflecattest/bar.txt (uploading, 0.00%) F:\swl\Sia-v1.3.2-windows-amd64>siac version Sia Client Version 1.3.2 Git Revision 72938f5 Build Time Tue Mar 13 19:48:47 EDT 2018 Sia Daemon Version 1.3.3 Git Revision da17f8f Build Time Tue Jun 5 11:47:29 MST 2018

New log entries since restart to apply merge da17f8f: Renter:

2018/06/05 18:52:38.921550 persist.go:439: STARTUP: Logging has started. Siad Version 1.3.3 2018/06/05 18:52:39.540129 uploadheap.go:312: Repairing 6 chunks

hostdb:

2018/06/05 18:52:38.237464 hostdb.go:97: STARTUP: Logging has started. Siad Version 1.3.3 2018/06/05 18:53:07.501180 scan.go:449: Performing scan on 796 online hosts and 234 offline hosts.

contractor

2018/06/05 18:52:38.661017 contractor.go:217: STARTUP: Logging has started. Siad Version 1.3.3

ChrisSchinnerl commented 6 years ago

When the renter is at fault the contracts won't be marked as bad and therefore they won't be replaced. That means having a locked wallet won't void your contracts anymore. Once the contracts are marked as bad they will simply expire and the remaining funds will be returned to the user. The renter will try to form a new set of contracts to replace the expired ones.

memory-thrasher commented 6 years ago

Ok, so I will eventually get my sc back, when the contracts expire. Is there any way for me to use renter effectively in the mean time, before they expire? siac renter uploads still says 0% uploading. I was not expecting this kind of bug so I selected a long-term contract period, ending in block 183072, which by my math is about 7 months. Would I still get my sc back if I purge the contracts from the renter dir?

ChrisSchinnerl commented 6 years ago

Yes you would still get your money back. For the future I would recommend shorter periods. Especially on RCs since they contain new features (like the contract replacement for failed renewals) which might be unstable. We never merge any code without tests but it seems like this time we missed an edge case.

memory-thrasher commented 6 years ago

@ChrisSchinnerl Alright, so I stopped siad moved out the contracts folder in datadir/renter to another location, started siad, set a new period and renewal window (5200/1600), unlocked and waited. I now have contracts with 50 distinct hosts. I submitted the api request to try to upload again and it is still stuck at 0%.

renter.log:

2018/06/05 23:00:13.229808 persist.go:439: STARTUP: Logging has started. Siad Version 1.3.3 2018/06/05 23:00:13.777878 uploadheap.go:312: Repairing 0 chunks 2018/06/05 23:15:13.778663 uploadheap.go:312: Repairing 0 chunks 2018/06/05 23:30:13.782449 uploadheap.go:312: Repairing 7 chunks 2018/06/05 23:45:13.786235 uploadheap.go:312: Repairing 0 chunks 2018/06/06 00:00:13.789521 uploadheap.go:312: Repairing 5 chunks 2018/06/06 00:15:13.792807 uploadheap.go:312: Repairing 7 chunks 2018/06/06 00:29:52.377373 uploadheap.go:312: Repairing 1 chunks

hostdb.log:

2018/06/05 23:00:12.713743 hostdb.go:97: STARTUP: Logging has started. Siad Version 1.3.3 2018/06/05 23:00:34.998572 scan.go:449: Performing scan on 792 online hosts and 239 offline hosts.

contractor.log:

2018/06/05 23:00:12.986277 contractor.go:217: STARTUP: Logging has started. Siad Version 1.3.3 2018/06/05 23:25:49.506890 contracts.go:655: Attempted to form a contract with siaberry.mooo.com:9984, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:25:49.983451 contracts.go:655: Attempted to form a contract with sia.xertez.org:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout 2018/06/05 23:25:52.320248 contracts.go:198: Formed contract 13810c6df4e9d050757a988dfde1cc04b12f4495766090ecd3503b24f992a527 with 188.244.40.69:10004 for 3.733 SC 2018/06/05 23:25:55.708678 contracts.go:198: Formed contract 846929146641b8e48bfbcebd4058e0fa6ea4dad4f8c9d4f0b80eb4aa6b7ffccf with sia.acbilibilifun.cc:9982 for 3.991 SC 2018/06/05 23:26:02.791577 contracts.go:198: Formed contract adfd1eff992835469c6c85b127a8214ee975a4566ddb9cf33f1f5eaa98db2887 with bigj816.ddns.net:9982 for 4.378 SC 2018/06/05 23:26:04.460789 contracts.go:655: Attempted to form a contract with 50.78.254.209:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout 2018/06/05 23:26:05.046364 contracts.go:655: Attempted to form a contract with sia1.utahminingpools.com:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:26:07.933230 contracts.go:198: Formed contract 4e4492261aedd9e7cb51ad047d8f9eb840cefccf64cc0051cbd1c233f81daa2f with 88.206.14.131:9982 for 3.991 SC 2018/06/05 23:26:11.550690 contracts.go:198: Formed contract 2528de0cfd43a41699a68df081b27eefbda8ef3f276d954e1e51944f1c343342 with 01siahostnl030.ddns.net:9982 for 3.577 SC 2018/06/05 23:26:17.245913 contracts.go:198: Formed contract 95d19b8bc1feab01c27972ba620a94beb5d349a3b6a8aab0309bc706372a4e2b with sanasol.sknt.ru:9985 for 3.797 SC 2018/06/05 23:26:29.965028 contracts.go:655: Attempted to form a contract with citberry.anydns.info:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout 2018/06/05 23:26:32.963909 contracts.go:198: Formed contract 5e8b707fb910baf6ccbacfef73c80311cfefd808c1aa8a7f64d69c467e35a5f5 with 134.249.162.219:9982 for 2.827 SC 2018/06/05 23:26:36.135312 contracts.go:655: Attempted to form a contract with sia.octivodigital.com:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:26:37.461480 contracts.go:655: Attempted to form a contract with 62.216.63.179:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:26:39.304214 contracts.go:655: Attempted to form a contract with 101.100.158.73:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: could not add collateral: insufficient balance] 2018/06/05 23:26:39.962798 contracts.go:655: Attempted to form a contract with 82.220.99.82:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout 2018/06/05 23:26:41.518495 contracts.go:655: Attempted to form a contract with elysium.thepilchardworks.win:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: could not add collateral: insufficient balance] 2018/06/05 23:26:42.530624 contracts.go:655: Attempted to form a contract with 162.229.184.124:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:26:46.825669 contracts.go:198: Formed contract fe3624d111a25ab3cafc01fc6b19789d288f907da2dedf54dc2820f4e5ce29c6 with 82.231.193.206:9982 for 3.571 SC 2018/06/05 23:26:49.532513 contracts.go:655: Attempted to form a contract with 109.172.42.157:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:26:51.983324 contracts.go:198: Formed contract 3bc138a1db08a76904f0dc03d4a3a24e03451c6b6e0c7fde0d0745b4c757fe6a with 212.77.177.47:9982 for 1.845 SC 2018/06/05 23:26:54.752676 contracts.go:655: Attempted to form a contract with telariust.ddns.net:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:26:56.282870 contracts.go:655: Attempted to form a contract with 5.45.122.117:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: could not add collateral: insufficient balance] 2018/06/05 23:26:59.329757 contracts.go:198: Formed contract 5f268b0424e8be5fcbad37d14b49ab0b0dbb414be6b274c77cd28a3e4c9b4d56 with 85.69.198.249:9982 for 4.443 SC 2018/06/05 23:27:06.742198 contracts.go:198: Formed contract 167549933fd2ea95843403c31d69c6b738ddfd710590f70ecf89fa6c77ffb5dc with 121.186.160.71:9982 for 4.396 SC 2018/06/05 23:27:12.694954 contracts.go:198: Formed contract 238921e8d20fc40cd8c91fccf50ed8f051d7c2dd0b3dab52384019dda47f605a with 81.217.214.7:9982 for 3.025 SC 2018/06/05 23:27:13.883105 contracts.go:655: Attempted to form a contract with 97.90.10.186:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout 2018/06/05 23:27:20.946002 contracts.go:198: Formed contract 542162cc57bcdce0d36053c58b081b47a8845b10993310535832199d2f5190c2 with sia1.justmike.at:9982 for 3.709 SC 2018/06/05 23:27:25.797618 contracts.go:198: Formed contract 5ba37f4a5ee237c9b8f42962433cea650806649c6a7012402f97cc701974358a with sia.net10.space:9982 for 4.282 SC 2018/06/05 23:27:27.137788 contracts.go:655: Attempted to form a contract with 78.29.234.198:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: could not add collateral: insufficient balance] 2018/06/05 23:27:27.482832 contracts.go:655: Attempted to form a contract with shaw.majestik.org:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout 2018/06/05 23:27:30.730744 contracts.go:198: Formed contract 56633098596e2fb7599cb266701a9b930e9d428678aeb0b2c6479ed1089066a7 with 81.200.59.110:9982 for 4.468 SC 2018/06/05 23:27:33.501596 contracts.go:655: Attempted to form a contract with sia.good4it.com:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:27:35.690374 contracts.go:198: Formed contract de997e453948ca4c7346d38f6195ee0e572ebc773317bbea94b00a088eecf286 with sia.coinnodehub.com:9982 for 3.991 SC 2018/06/05 23:27:38.131184 contracts.go:198: Formed contract 621c221f40e4c05272731b3b0ad30253a14e6c727f9b8f78aa01345cfb859f8b with siabox.ignorelist.com:9982 for 4.564 SC 2018/06/05 23:27:43.132819 contracts.go:198: Formed contract 7cefaa14cea4116f9cf784b634ba2b279a37e76ccabcc34bd813416b8296e723 with 176.240.111.223:9982 for 4.559 SC 2018/06/05 23:27:44.595505 contracts.go:655: Attempted to form a contract with 86.61.77.10:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: could not add collateral: insufficient balance] 2018/06/05 23:27:44.674015 contracts.go:655: Attempted to form a contract with 96.95.217.5:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout 2018/06/05 23:27:47.201836 contracts.go:655: Attempted to form a contract with plumbus-1.asuscomm.com:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:27:48.798039 contracts.go:198: Formed contract 797c415a3061ee2a2ad2679221f22e7b235b66d194bf85b6c0df72c39065d257 with 83.81.98.41:9982 for 4.212 SC 2018/06/05 23:27:51.702407 contracts.go:198: Formed contract b04dc63b9219c4d5895969b781f13603fb18fcdc959d88b68eaed66839036908 with rotor.chickenkiller.com:9982 for 4.378 SC 2018/06/05 23:27:54.380247 contracts.go:198: Formed contract e7eb7673b341fed857c64ca303b5775ee9913994617c0ac314f1b4ee18b51e4a with sia.pyraster.fr:9982 for 1.328 SC 2018/06/05 23:27:57.693668 contracts.go:198: Formed contract ee7922cbdbe191914f49430536a492412e0fb37410e794df8f8310b91b672dae with 108.214.204.207:9982 for 4.378 SC 2018/06/05 23:27:59.319375 contracts.go:198: Formed contract eaadf6355f5febe8d9c37000d1dc060f46de0496bdf63b304c358d21893e701e with ocha1.myfirewall.co:9982 for 4.282 SC 2018/06/05 23:28:03.418395 contracts.go:198: Formed contract 68a7eb732e367c8989e7104fbfc981ef11a07adbf51baddf4d40c88db11f6782 with sanasol.sknt.ru:9982 for 3.797 SC 2018/06/05 23:28:05.705185 contracts.go:198: Formed contract ea0647d1078111b5f259cd0ef5dbe6298e3ec4cc6e5100327d318eea968b1641 with sia.bryhardt.com:9982 for 3.067 SC 2018/06/05 23:28:07.385399 contracts.go:655: Attempted to form a contract with sia.tutemwesi.com:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:28:11.251890 contracts.go:198: Formed contract 779ddc64c50bf62ac55dea04dcdd293c529f59cc21333913f3610de331c11084 with mjoelkbar-hem.asuscomm.com:9985 for 3.4 SC 2018/06/05 23:28:19.194398 contracts.go:198: Formed contract 1fdfe0623960d1af2765fd563a26dbf5058db35c202c8a2163cdaffeedc190bb with goldfinger.spdns.eu:9982 for 1.18 SC 2018/06/05 23:28:32.625604 contracts.go:198: Formed contract 24e65fc11f759c6d05311beca7cb9ab302aa54c24f51a7e64d0a1a8a1ae707f4 with 174.117.123.25:9982 for 3.726 SC 2018/06/05 23:28:42.823399 contracts.go:198: Formed contract 939528ab62eee2978b39732e75b243e94c3c920880624ab54f16d097ceb70a81 with 154.58.5.124:9982 for 4.443 SC 2018/06/05 23:28:46.776401 contracts.go:198: Formed contract a41d649bfe2b5a6f7670a6a5521d2ca7048964c4dfe71f43768b0b077b57af13 with 87.237.42.180:9982 for 4.378 SC 2018/06/05 23:28:47.989055 contracts.go:655: Attempted to form a contract with seriemotion.com:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:28:48.029560 contracts.go:655: Attempted to form a contract with 163.172.57.201:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout 2018/06/05 23:28:49.186707 contracts.go:198: Formed contract 7f4e5b4d526211a15ad321626ffc00bb5f32c7a004a112ef8bc21f3e0b80cf5a with 74.131.196.214:9982 for 3.68 SC 2018/06/05 23:28:50.693398 contracts.go:198: Formed contract a66a4f267b72e5c6c6ec80ab6e3ac8e6039dbf142e58b5db9391a8dd0444dfd0 with 46.59.26.233:9982 for 3.797 SC 2018/06/05 23:28:56.042577 contracts.go:198: Formed contract db742a3d4a066b4a1d72f3028899abb59b45e26f6e6aa46ffa05b02f53ac5307 with btc.ln.dyndns.0xcb0.com:9982 for 4.335 SC 2018/06/05 23:28:57.247230 contracts.go:655: Attempted to form a contract with aussia.ddns.net:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:28:59.149472 contracts.go:198: Formed contract 46bf5d4e79309dd700cf7322cc5a5e5042643a857bf6cd6f7e26c39a3dfde504 with 88.190.16.23:9982 for 1.567 SC 2018/06/05 23:29:00.825185 contracts.go:198: Formed contract ace68e6599114b6739f05f7b51c37645af0df19eb87b723ce5566ecab2d623e2 with sia.techandsupply.ca:9982 for 3.215 SC 2018/06/05 23:29:03.847069 contracts.go:198: Formed contract eee5fe3b7fcd1e45fc15bf28b7ad961e968df2244b4d48fe36f7c9198e91b197 with awesomehost.chickenkiller.com:9982 for 3.293 SC 2018/06/05 23:29:05.571788 contracts.go:198: Formed contract 29549ae0dae67cc05e65fca1260a53bd0e96766b645c2d1ce238df00d04fd5b1 with 73.202.32.88:9982 for 3.65 SC 2018/06/05 23:29:06.766439 contracts.go:655: Attempted to form a contract with 84.30.178.110:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: could not add collateral: insufficient balance] 2018/06/05 23:29:08.864706 contracts.go:198: Formed contract f238f0d2c0e3d30cb944f0993627c392aeb412ff35c9e269ed38ad6be5e3d39f with mobims.ru:9982 for 1.889 SC 2018/06/05 23:29:10.488912 contracts.go:655: Attempted to form a contract with 62.210.92.191:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: could not add collateral: insufficient balance] 2018/06/05 23:29:12.308143 contracts.go:655: Attempted to form a contract with n0tursf00k0ff.ddns.net:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:29:15.408037 contracts.go:198: Formed contract a01369418d8ecb5f1f0745aaa0bae5d5e0360d89b1130c8450b00ebe4c97c153 with siaberry2.strangled.net:9972 for 3.663 SC 2018/06/05 23:29:20.301158 contracts.go:198: Formed contract 2e7b58e2c0c3c239111b01fadea090bff511c81136d735f68a93a77b33a94dc9 with 97.79.179.26:9982 for 3.991 SC 2018/06/05 23:29:25.007756 contracts.go:198: Formed contract 039666d188eb77aa701e342f114971c590c84b5594d59f6cac938f26281773f2 with 203.86.202.165:9982 for 3.889 SC 2018/06/05 23:29:56.327733 contracts.go:198: Formed contract 046e8a6d98586a5d7985037e833216064a9df3b70486a1a36a17f14943f30b2b with 69.117.49.104:9982 for 4.378 SC 2018/06/05 23:29:57.781917 contracts.go:198: Formed contract 9cd22387f04757824e477af4251adb2d4b3be5547b2abc3f7a542486cf729b97 with 195.222.134.76:9982 for 4.378 SC 2018/06/05 23:29:59.882684 contracts.go:198: Formed contract 385ca063677bcfec33c6a96a31260ba1388bbbcbfde2bfafa70f138e75dd6294 with home.randloki.com:9982 for 3.566 SC 2018/06/05 23:30:01.922943 contracts.go:198: Formed contract 014d59a529a4192abcddf2ac52ce3a00a4c0e8f73ec1d0d73db00ec29c944b7c with siajpn.ddns.net:9982 for 4.475 SC 2018/06/05 23:30:02.095465 contracts.go:655: Attempted to form a contract with siacoinhost.ddns.net:9982, but negotiation failed: not enough money to pay both siafund fee and also host payout 2018/06/05 23:30:04.008708 contracts.go:198: Formed contract 8d9ccbb8a6e682f3ad182b1a596b296682016297974956cbfbc3918c91d78380 with siaberry2.randloki.com:9972 for 3.566 SC 2018/06/05 23:30:09.499905 contracts.go:198: Formed contract 33c48151f6691e3d2989ce7a48191060bf262584ffe1caaa7265cbc156cdc06e with 71.226.76.64:9982 for 4.439 SC 2018/06/05 23:30:10.404520 contracts.go:655: Attempted to form a contract with chinggisnet.asuscomm.com:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: host has reached its collateral budget and cannot accept the file contract] 2018/06/05 23:30:12.177245 contracts.go:198: Formed contract 021ad68e4c4f699b908dece1d9fd2f088d79b7ea4227acaed7e83f1538fd34e2 with sia.webbah.com:9982 for 3.991 SC 2018/06/05 23:30:14.099489 contracts.go:198: Formed contract 16f3e9975aaa3348483b6d00f93a0c6a39ce7e1f0c11a8daaedd6a3368b5c222 with hypnohost.mooo.com:9982 for 3.65 SC 2018/06/05 23:30:15.031108 contracts.go:655: Attempted to form a contract with suwit.ddns.net:9982, but negotiation failed: [host has returned an error; host did not accept our proposed contract: internal error: could not add collateral: wallet must be unlocked before it can be used] 2018/06/05 23:30:16.901345 contracts.go:198: Formed contract 35158aeee4caddb33e46dfc904588806833972bec283fff368a4b700bff94512 with 109.248.206.13:9982 for 3.733 SC

Any more ideas?

memory-thrasher commented 6 years ago

I just briefly ran a debug build in the same env. contractor.log had nothing new. hostdb.log has much more data, but I'm pretty sure the list of hosts is not relevant because contracts are being formed. renter.log has a bunch of these:

2018/06/06 03:26:28.501410 workerupload.go:113: [DEBUG] Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:55065->93.31.40.157:9982: i/o timeout] 2018/06/06 03:26:44.885490 workerupload.go:113: [DEBUG] Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:55063->83.77.39.46:9982: i/o timeout] 2018/06/06 03:27:01.269571 workerupload.go:113: [DEBUG] Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:55067->69.196.152.119:9982: i/o timeout] 2018/06/06 03:27:17.829674 workerupload.go:113: [DEBUG] Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:55064->80.108.24.36:9982: i/o timeout]

In fact, every contract host has one such line. I verified this with the follow bash monster (I'm putting this here mostly so I can find it again, but if it is of use to anyone else, all the better):

diff -w <(curl -sA 'Sia-Agent' http://127.0.0.1:9980/renter/contracts | grep -o '"netaddress":"[^"]*"' | cut -d\" -f4 | cut -d: -f1 | sort -u | (while read line; do echo "$line" | egrep '[0-9]+(.[0-9]+){3}' || nslookup $line 2>&1 | grep Address | cut -d\ -f3; done;) | grep -vF 10.16.1 | sort -u) <(grep -F 'i/o timeout' renter.log | egrep -o -- '[0-9]+(.[0-9]+){3}' | grep -vF 10.16.1 | sort -u)

Which returned nothing, meaning every ip address listed in the contracts return is listed in the renter.log timeout error messages, and vice versa.

So, are these IO timeouts TCP or HDD? Unfortunately, the offending function (Upload at modules/renter/proto/editor.go:61) has a great many error returns, and your debug setup does not give a stack trace, not even in a debug build. My best guess is the return statement at line 148, following negotiateRevision (editor.go:154). I pinged a few random hosts and got an average round-trip delay of 570ms, which is a bit long but should not be completely inhibiting. Tomorrow I'll try a packet interception to see if we're getting delayed responses, replays, or resets, then maybe I'll add more debug statements to the function tree to trace it down. I'll keep this thread updated.

memory-thrasher commented 6 years ago

I got a packet capture between my renter and a host at 108.214.204.207. Github will not let me upload the pcap. I do not understand the binary communication, but I do see a few plain english words in the stream: Renter: ReviseContract Host: accept, ed25519, ed25519 (new keys?) Renter: accept, Insert

This is followed by almost 100k of renter->host, then the conversation just stops. There are a large number of retransmitted packets, and acks are grouped tightly together in groups of around 10, plus the data window is reduced several times, which would seem to indicate high latency and/or bandwidth limitation.

What is interesting is that there are no FIN or RST packets. Neither side attempts to close the connection, even after it has timed out. I waited for 5 minutes for the stream to end before stopping the capture, although the renter client specifies a timeout of 45 seconds. I am starting to think this is a bug in the tcp bindings in go. The investigation continues.

ChrisSchinnerl commented 6 years ago

Every host timed out? What does siac renter contracts tell you about your contracts? are they still good for uploading?

memory-thrasher commented 6 years ago

Yes. GoodForUpload is true for all entries. Sample 2 entries:

Contracts: Host Remaining Funds Spent Funds Spent Fees Da ta End Height ID GoodForUpload GoodForRenew siabox.ignorelist.com:9982 4.564 SC 0 H 769 mS 0 B 162352 621c221f40e4c05272731b3b0ad30253a14e6c727f9b8f78aa01345cfb859f8b true true 176.240.111.223:9982 4.559 SC 0 H 773.9 mS 0 B 162352 7cefaa14cea4116f9cf784b634ba2b279a37e76ccabcc34bd813416b8296e723 true true

I was just about to post the next volume in the saga:

Next I added line numbers to the err returns in renter/proto/editor.go, basically replacing each ''return nil, err" with "return nil, fmt.Errorf("%s, editor.go:", err)" to find out where the errors were actually coming from, and got mostly 207, 188, 137, 123.

Sample:

2018/06/06 17:54:23.689619 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:1773->82.231.193.206:9982: i/o timeout, editor.go:123] 2018/06/06 17:54:56.479783 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:1793->81.217.140.112:9982: i/o timeout, editor.go:123] 2018/06/06 17:57:56.718170 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:1760->88.206.15.127:9982: i/o timeout, editor.go:137] 2018/06/06 17:58:13.102251 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:1764->188.242.52.10:9985: i/o timeout, editor.go:137]

So I increased the relevant timeouts in modules/negotiate.go (NegotiateRecentRevisionTime, NegotiateFileContractTime, NegotiateSettingsTime) to 600 seconds, and now I am getting different write errors as well as a few timeouts still.

2018/06/06 18:19:28.776241 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; host did not accept revision request: EOF, editor.go:207, editor.go:188] 2018/06/06 18:19:28.840249 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; couldn't send challenge response: write tcp 10.16.1.200:3233->174.117.123.25:9982: wsasend: An established connection was aborted by the software in your host machine., editor.go:207, editor.go:188] 2018/06/06 18:19:28.904257 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; couldn't send challenge response: write tcp 10.16.1.200:3216->188.244.40.69:10004: wsasend: An established connection was aborted by the software in your host machine., editor.go:207, editor.go:188] 2018/06/06 18:19:28.968265 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; couldn't send challenge response: write tcp 10.16.1.200:3213->109.248.206.13:9982: wsasend: An established connection was aborted by the software in your host machine., editor.go:207, editor.go:188] 2018/06/06 18:19:29.032273 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; host did not accept revision request: EOF, editor.go:207, editor.go:188] 2018/06/06 18:19:29.096281 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; couldn't send challenge response: write tcp 10.16.1.200:3207->188.242.52.10:9985: wsasend: An established connection was aborted by the software in your host machine., editor.go:207, editor.go:188] 2018/06/06 18:19:29.160289 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; host did not accept revision request: EOF, editor.go:207, editor.go:188] 2018/06/06 18:19:45.608378 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; couldn't send challenge response: write tcp 10.16.1.200:3204->95.165.147.40:9982: wsasend: An established connection was aborted by the software in your host machine., editor.go:207, editor.go:188] 2018/06/06 18:21:23.992871 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3251->81.217.140.112:9982: i/o timeout, editor.go:137] 2018/06/06 18:21:40.376952 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3249->99.27.120.250:9982: i/o timeout, editor.go:137] 2018/06/06 18:21:56.760532 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3240->80.108.24.36:9982: i/o timeout, editor.go:137] 2018/06/06 18:22:13.144613 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3239->83.128.202.64:9982: i/o timeout, editor.go:137] 2018/06/06 18:23:02.296354 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3241->115.192.116.200:9982: i/o timeout, editor.go:137] 2018/06/06 18:23:18.680935 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3238->47.40.180.230:9982: i/o timeout, editor.go:137] 2018/06/06 18:23:35.064015 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3246->83.77.39.46:9982: i/o timeout, editor.go:137] 2018/06/06 18:23:51.456097 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3247->136.32.148.119:9982: i/o timeout, editor.go:137] 2018/06/06 18:24:07.840177 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; couldn't send contract ID: write tcp 10.16.1.200:4344->81.200.59.110:9982: i/o timeout, editor.go:207, editor.go:188] 2018/06/06 18:24:07.848178 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3244->136.30.34.59:9982: i/o timeout, editor.go:137] 2018/06/06 18:24:24.232259 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3252->69.245.6.238:9982: i/o timeout, editor.go:137] 2018/06/06 18:24:40.616839 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3245->126.150.92.41:9982: i/o timeout, editor.go:137] 2018/06/06 18:24:57.000420 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3243->79.113.78.101:9982: i/o timeout, editor.go:137] 2018/06/06 18:25:13.384000 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3248->146.52.109.85:9982: i/o timeout, editor.go:137] 2018/06/06 18:25:15.668290 uploadheap.go:312: Repairing 0 chunks 2018/06/06 18:25:29.800585 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3250->84.219.196.39:9985: i/o timeout, editor.go:137] 2018/06/06 18:25:46.184666 workerupload.go:113: Worker failed to upload via the editor: [host has returned an error; write tcp 10.16.1.200:3253->94.11.108.89:9982: i/o timeout, editor.go:137] 2018/06/06 18:26:02.756770 workerupload.go:103: Worker failed to acquire an editor: [host has returned an error; couldn't send contract ID: write tcp 10.16.1.200:4458->47.40.180.230:9972: wsasend: An established connection was aborted by the software in your host machine., editor.go:207, editor.go:188]

I guess 10 minutes is more then win7 allows a connection to be idle. I pinged a few of the hosts and latency is around 300-1000ms, so they really shouldn't be timing out. The investigation continues.

memory-thrasher commented 6 years ago

P.S. The ongoing packet capture is still not reporting reset or finish packets (except for one host, 68.231.191.159, which resets in tight groups of 3 every ~15 minutes), even though all these connections are apparently timing out and failing. Tcp conformance (or the lack there of) appears to be handled well below your go code. Again I reluctantly lean towards an issue in the go tcp windows bindings. What we are seeing should not be possible, let alone consistent.

memory-thrasher commented 6 years ago

I just got a bunch of FIN,ACK packets on the interceptor. Looks like ~30 of the 50 hosts all gracefully terminated their connections all at once (within 60 seconds). Still 0% upload progress. I picked one of the FINs at random and examined the stream:

renter: ReviseContract host: accept ed25519 ed25519

<5 minute delay> renter: 0e 00 00 00 00 00 00 00 <5 minute delay> host: FIN,ACK renter: ACK

Edit: Forgot to mention that the only log entries for that host are still i/o timeouts. I am currently moving the sia data dir to a ubuntu vm to see if linux has this problem also.

memory-thrasher commented 6 years ago

I copied the datadir to and compiled the same git revision da17f8f4 on an ubuntu vm (without my changes), unlocked the wallet, and added a new upload. All uploads are still stalled at 0. Nothing of interest in the logs.

I'm running out of ideas. Has anyone on your end got a renter upload to work with this git build?

memory-thrasher commented 6 years ago

I have tested the new release bins (v1.3.3) and the problem persists on win64 and linux64.

I am currently attempting to build an isolated test environment. Currently I have 5 nodes with versions distributed between the last few releases (v1.3.0-v1.3.3). Most nodes are (intentionally) "synced" at blockheight 0, with one doing CPU mining. Once I get block progress, I will start adding random transactions. Once that works, I will scale up the number of nodes, probably around 250~500, then I plan to start replicating/fixing file hosting issue.

Hope you guys don't mind me doing my own effort here. I can see you're busy. If I get it to work, I will gladly commit my fix for review. I have forked you.