ArweaveTeam / arweave

The Arweave server and App Developer Toolkit.
https://www.arweave.org
GNU General Public License v2.0

[chunk] Failed to fetch chunk at offset 646971393 error #602

Closed techinged closed 2 months ago

techinged commented 2 months ago

I deployed a private Arweave network on my local network, and everything seems OK. When I upload a 36 KB file to it, I can always download the file successfully. But when I upload a 122 MB file, I can download it successfully a few minutes later, yet when I download the same file again a few hours later it fails with an error like '[chunk] Failed to fetch chunk at offset 646971393'.

http://192.168.5.100:1984/chunk/646971393 returns an error. http://192.168.5.100:1984/tx/rzMk0A0YZAsGYgkYBZf9D1lUhqXYrmSYgKB-UzBp7Ug, http://192.168.5.100:1984/tx/rzMk0A0YZAsGYgkYBZf9D1lUhqXYrmSYgKB-UzBp7Ug/status, and http://192.168.5.100:1984/tx/rzMk0A0YZAsGYgkYBZf9D1lUhqXYrmSYgKB-UzBp7Ug/offset all return OK. http://192.168.5.100:1984/tx/rzMk0A0YZAsGYgkYBZf9D1lUhqXYrmSYgKB-UzBp7Ug/data returns {"error":"tx_data_too_big"}.

Can anyone tell me what the problem is? Why can't I download a 122 MB file from a private Arweave network? The file seems to have been uploaded successfully. Thanks a lot.

techinged commented 2 months ago

The Arweave version is 2.7.4.

vird commented 2 months ago

Did you specify any storage modules? If not, the data has been dropped from the cache and is no longer available.

techinged commented 2 months ago

I use the command line below to start the private Arweave network:

nohup bin/start init data_dir /opt/arweave/data/ mining_addr O2t2ASJX8mkqRhcvpFZdclhNT6lFNIRKir8--X_OrtY storage_module 0,O2t2ASJX8mkqRhcvpFZdclhNT6lFNIRKir8--X_OrtY mine &

vird commented 2 months ago

Is the data really in the storage module? What does /metrics say?

techinged commented 2 months ago

[image]

The screenshot above shows what /metrics prints. I don't know how to use /metrics to figure out the problem. Is there any documentation on how to set up a storage module?

vird commented 2 months ago

Look at something like

v2_index_data_size_by_packing{store_id="default",packing="unpacked",partition_number="undefined",storage_module_size="undefined",storage_module_index="undefined"} 2359296
v2_index_data_size_by_packing{store_id="storage_module_26_unpacked",packing="unpacked",partition_number="26",storage_module_size="3600000000000",storage_module_index="26"} 3553407008768
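
For readers who prefer to check these lines programmatically rather than by eye, here is a minimal sketch that parses the two sample lines above and reports the stored bytes per store_id. The helper name parseMetric is made up for this illustration, and the parser only handles the simple `name{labels} value` shape shown above, not the full Prometheus text format.

```javascript
// Parse lines of the form name{key="value",...} number for one metric name.
// parseMetric is a hypothetical helper, not part of Arweave or arweave-js.
function parseMetric(text, metricName) {
  const records = [];
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith(metricName + '{')) continue;
    const close = trimmed.lastIndexOf('}');
    if (close === -1) continue;
    const labelText = trimmed.slice(metricName.length + 1, close);
    const labels = {};
    for (const match of labelText.matchAll(/(\w+)="([^"]*)"/g)) {
      labels[match[1]] = match[2];
    }
    records.push({ labels, value: Number(trimmed.slice(close + 1).trim()) });
  }
  return records;
}

// The two sample lines quoted above.
const sampleMetrics = [
  'v2_index_data_size_by_packing{store_id="default",packing="unpacked",partition_number="undefined",storage_module_size="undefined",storage_module_index="undefined"} 2359296',
  'v2_index_data_size_by_packing{store_id="storage_module_26_unpacked",packing="unpacked",partition_number="26",storage_module_size="3600000000000",storage_module_index="26"} 3553407008768',
].join('\n');

const records = parseMetric(sampleMetrics, 'v2_index_data_size_by_packing');
for (const { labels, value } of records) {
  console.log(`${labels.store_id} (${labels.packing}): ${value} bytes`);
}
```

In practice the text would come from the node's /metrics endpoint instead of the hard-coded sample.
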
techinged commented 2 months ago

[image]

This is the screenshot. Does it mean that the storage module configuration is wrong?

vird commented 2 months ago

So it's 3 chunks, 786432/(256*1024) = 3, not 122 MB, so the data is not stored. Maybe the problem is not in the Arweave node itself but in how you interact with it. Please clarify how you uploaded the 122 MB file: what script did you use, what was its response, and any other details you have.
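
The arithmetic above can be sketched as a small helper. It assumes the 256 KiB chunk size used in the calculation and treats "122M" as 122 MiB for illustration; the function name chunkCount is made up for this example and is only a back-of-the-envelope estimate.

```javascript
// Bytes per full chunk, as used in the calculation above.
const CHUNK_SIZE = 256 * 1024;

// Estimate how many chunks a payload of byteLength bytes occupies.
function chunkCount(byteLength) {
  if (!Number.isInteger(byteLength) || byteLength < 0) {
    throw new RangeError('byteLength must be a non-negative integer');
  }
  return Math.ceil(byteLength / CHUNK_SIZE);
}

console.log(chunkCount(786432));            // 3, the value seen in /metrics
console.log(chunkCount(122 * 1024 * 1024)); // 488, roughly what a 122 MiB upload needs
```

A stored size of only a few chunks therefore indicates that the large upload never reached the storage module.
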

techinged commented 2 months ago

[image]

I just modified the storage module configuration, re-initialized the data, and uploaded the 122 MB file again. The screenshot above shows the current state. Is everything OK now?

vird commented 2 months ago

Note: it's not inside a packed storage module. It's in store_id="default". So you need to investigate why your Arweave node is not packing. What is your launch script?

techinged commented 2 months ago

I use the command line below to launch the private Arweave network:

nohup bin/start init data_dir /opt/arweave/data/ mining_addr O2t2ASJX8mkqRhcvpFZdclhNT6lFNIRKir8--X_OrtY storage_module 0,O2t2ASJX8mkqRhcvpFZdclhNT6lFNIRKir8--X_OrtY mine &

techinged commented 2 months ago

With the configuration above, will the uploaded 122 MB file be lost, so that I cannot download it a few hours later?

vird commented 2 months ago

Sorry, I forgot that you had already posted that. The script looks OK. Maybe something nearby is wrong. Is there enough space (>110 GB)? Is there any warning that packing stopped because there is no space? Can you try launching it interactively in a separate terminal so you can see the output?

vird commented 2 months ago

Has the block that should include the transaction actually been mined? If no mined transaction contains the data, then after some time the data will be deleted from the cache (if only because the tx will expire).

maxmetagravity commented 2 months ago

hey techinged

I've encountered the same issue. I deployed a single-node localnet environment on AWS, changing the NETWORK_NAME to ar.localnet. The initialization and restart processes were normal. Last night, I uploaded a large file of 180MB. After the transaction was confirmed, I was able to download the file correctly. However, when I tried to download it again this morning, I encountered a 400 error. This phenomenon has occurred twice. Is this related to the localnet environment?

Screenshot 2024-08-15 at 08 53 47

Additionally, in the localnet single-node environment, block production becomes faster over time. Currently, it's producing a block roughly every 5-10 seconds. Is this normal? Below is my startup configuration.

nohup ./bin/start data_dir data port 1984 mining_addr DxORDYdnBQ_L8Qs56m_j98x-wnNyQp1SKxILdxC0vss storage_module 0,DxORDYdnBQ_L8Qs56m_j98x-wnNyQp1SKxILdxC0vss mine start_from_block_index &

Note: My AWS server only has one hard drive. The storage_module is located in the default "data" directory.

techinged commented 2 months ago
1723686049054

Hi, the screenshot above shows what /metrics prints now. Does it mean the storage module is OK? I still cannot download the uploaded 122 MB file, although I downloaded the same file successfully a few hours ago. By the way, https://docs.arweave.org/developers/mining/mining-guide says that a separate drive should be mounted at /opt/arweave/data/storage_modules/storage_module_0_O2t2ASJX8mkqRhcvpFZdclhNT6lFNIRKir8--X_OrtY. I didn't do that because I only have one drive. That directory does exist and contains subfolders and files. My disk has more than 180 GB of free space.

maxmetagravity commented 2 months ago

Hey vird, could you give me a hand?

The hard drive has 500 GB, and usage is 5%.

The uploaded file has 50+ confirmations (http://node1.321.io:1984/tx/pxY_pMQQoRdA84608cMVL7FdGV7JIvHbFccFGRnFSoY/status).

Screenshot 2024-08-15 at 10 28 33

The following is my Node.js CLI upload script:

============ upload_test.js ================

const Arweave = require('arweave');
const fs = require('fs');
const path = require('path');

const arweave = Arweave.init({ host: 'node1.321.io', port: 1984, protocol: 'http' });

const walletFile = './wallets/arweave_keyfile_DxORDYdnBQ_L8Qs56m_j98x-wnNyQp1SKxILdxC0vss.json';
const rawWallet = fs.readFileSync(walletFile);
const wallet = JSON.parse(rawWallet);

// Read the file, sign a data transaction, and upload it chunk by chunk.
async function uploadFile(filePath) {
    const data = fs.readFileSync(filePath);
    const fileName = path.basename(filePath);

    console.log(`File '${fileName}' read. Size: ${data.length} bytes`);

    const transaction = await arweave.createTransaction({ data: data }, wallet);

    transaction.addTag('Content-Type', 'application/octet-stream');
    transaction.addTag('File-Name', fileName);

    await arweave.transactions.sign(transaction, wallet);

    const uploader = await arweave.transactions.getUploader(transaction);

    while (!uploader.isComplete) {
        await uploader.uploadChunk();
        console.log(`${uploader.pctComplete}% complete, ${uploader.uploadedChunks}/${uploader.totalChunks}`);
    }

    console.log('Upload completed');
    console.log('Transaction ID:', transaction.id);
    return transaction.id;
}

// Return true once the transaction is confirmed, false otherwise.
async function checkTransactionStatus(transactionId) {
    try {
        const status = await arweave.transactions.getStatus(transactionId);
        console.log("Transaction status:", status);

        if (status.status === 200 && status.confirmed) {
            console.log("Transaction is confirmed!");
            console.log("Number of confirmations:", status.confirmed.number_of_confirmations);
            return true;
        } else if (status.status === 202) {
            console.log("Transaction is pending.");
        } else if (status.status === 404) {
            console.log("Transaction not found. It might have been dropped or not yet propagated through the network.");
        } else {
            console.log("Unexpected status:", status);
        }
    } catch (err) {
        console.error("Error checking transaction status:", err);
    }
    return false;
}

// Poll the transaction status until it is confirmed.
async function monitorTransaction(transactionId) {
    let confirmed = false;
    while (!confirmed) {
        confirmed = await checkTransactionStatus(transactionId);
        if (!confirmed) {
            await new Promise(resolve => setTimeout(resolve, 30000)); // wait 30 seconds
        }
    }
}

async function main() {
    const filePath = process.argv[2];

    if (!filePath) {
        console.error('Please provide a file path as a command line argument.');
        console.log('Usage: node upload_test.js <FILE_PATH>');
        process.exit(1);
    }

    if (!fs.existsSync(filePath)) {
        console.error(`File not found: ${filePath}`);
        process.exit(1);
    }

    try {
        const transactionId = await uploadFile(filePath);
        await monitorTransaction(transactionId);
    } catch (err) {
        console.error("Error:", err);
    }
}

main();

========= End upload_test.js ==============

techinged commented 2 months ago

Hi vird, the attached log.log is the output printed by the launch script. Does it help to figure out the problem?

vird commented 2 months ago

@techinged I see that the metrics show the data is actually packed and available.

  1. Is this new tx available via the /tx/ endpoint?
  2. Did weave_size actually increase? Can you check weave_size in /block/current?
  3. For /chunk/, can you try low offsets: 1, 524288, 26214400? I see an issue with the previous offset you tried: 646971393/1e6 = 646.971393. You are trying to get a chunk at about 646 MB but uploaded only about 120 MB, so that offset is not allocated at all.

vird commented 2 months ago

@maxmetagravity in your case the chunk proof is available (http://node1.321.io:1984/chunk_proof/1) for all 3 offsets: 1, 524288, 26214400. Both /chunk and /chunk2 only return data if it is in an unpacked storage_module; they don't unpack on the fly. To get the chunk in any packing:

curl -v -H "x-packing: any" http://node1.321.io:1984/chunk/1

If x-packing is not set, then for compatibility with older API versions the node returns the unpacked chunk. If an unpacked copy is not available, it returns 404 (even if the chunk is actually stored, but packed).

https://github.com/ArweaveTeam/arweave/blob/55dbf9754e70423d48af7eb188fb944b2b1dacb8/apps/arweave/src/ar_http_iface_middleware.erl#L1995

techinged commented 2 months ago

@vird
I have changed the configuration and re-initialized the data directory, so some of the data and screenshots I posted earlier are obsolete and can be ignored. Please refer to the latest data and screenshots. I'll post the new data and screenshots you asked for.

techinged commented 2 months ago

The attached tx.json is the response from http://192.168.5.211:1984/tx/P4ePyZHWXCMJx85ga6F2QUx9Y3FiiW7A7RuFD0AMIcc

techinged commented 2 months ago

@vird http://192.168.5.211:1984/tx/P4ePyZHWXCMJx85ga6F2QUx9Y3FiiW7A7RuFD0AMIcc/offset says: {"size":"125201271","offset":"125987703"}

techinged commented 2 months ago

@vird 125987703 - 125201271 + 1 = 786433, and http://192.168.5.211:1984/chunk/786433 returns a 404 error.
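
The offset calculation above can be sketched in code. This is a minimal illustration that assumes, as the calculation does, that the "offset" field of /tx/<id>/offset is the absolute position of the transaction's last byte; the function name txByteRange is made up for this example.

```javascript
// Derive the absolute byte range of a transaction from the /tx/<id>/offset
// response, whose fields are decimal strings as in the response shown above.
function txByteRange(offsetResponse) {
  const size = Number(offsetResponse.size);
  const end = Number(offsetResponse.offset); // assumed: offset of the last byte
  if (!Number.isSafeInteger(size) || !Number.isSafeInteger(end) || size <= 0 || end < size - 1) {
    throw new RangeError('unexpected size/offset values');
  }
  return { start: end - size + 1, end };
}

const range = txByteRange({ size: '125201271', offset: '125987703' });
console.log(range.start); // 786433, the first offset to request from /chunk/
```
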

vird commented 2 months ago

@techinged see my previous comment about x-packing header https://github.com/ArweaveTeam/arweave/issues/602#issuecomment-2290499624

techinged commented 2 months ago

The attached chunk.json is the output of curl -v -H "x-packing: any" http://192.168.5.211:1984/chunk/1

techinged commented 2 months ago

The attached chunk2.json is the output of curl -v -H "x-packing: any" http://192.168.5.211:1984/chunk/524288

techinged commented 2 months ago

@vird 125987703 - 125201271 + 1 = 786433. http://192.168.5.211:1984/chunk/786433 returns a 404 error, but curl -v -H "x-packing: any" http://192.168.5.211:1984/chunk/786433 is OK and returns data.

techinged commented 2 months ago

@vird How can I make arweave-js send -H "x-packing: any" when it calls http://192.168.5.211:1984/chunk/786433?

techinged commented 2 months ago

@vird curl -v http://192.168.5.211:1984/chunk/786433 returns a 404 error, while curl -v -H "x-packing: any" http://192.168.5.211:1984/chunk/786433 is OK and returns data. Has the cause of the problem been found? How can I make arweave-js send -H "x-packing: any" when it calls these API URLs?

vird commented 2 months ago

> The attached is that curl -v -H "x-packing: any" http://192.168.5.211:1984/chunk/1 produces

As expected.

> The attached is that curl -v -H "x-packing: any" http://192.168.5.211:1984/chunk/524288 produces

As expected.

> @vird 125987703-125201271+1=786433 http://192.168.5.211:1984/chunk/786433 says: 404 error curl -v -H "x-packing: any" http://192.168.5.211:1984/chunk/786433 is ok, it returns data

As expected.

> Using arweave-js, how to let it use -H "x-packing: any" when arweave-js calls

Not supported: https://github.com/search?q=repo%3AArweaveTeam%2Farweave-js%20x-packing&type=code. Use e.g. axios and specify the headers manually.

Also note that the chunk you get != the chunk you posted. It's packed, and there is no easy JS way to unpack it. You can try the NAPI implementation from https://github.com/virdpool/arweave_randomx, but it is proof-of-concept code, not a real library. It is also missing the 2.6 code, so you would need to integrate that from the current Arweave implementation.
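
As an illustration of "specify the headers manually", here is a minimal sketch that mirrors the curl command from this thread. It uses the global fetch available in Node 18+ instead of axios; the function names are made up for this example, and treating the response as JSON is an assumption based on the chunk.json attachments above. The returned chunk is still packed.

```javascript
// Build the URL and headers for a /chunk request that accepts any packing.
// buildChunkRequest and fetchChunkAnyPacking are hypothetical helper names.
function buildChunkRequest(nodeUrl, offset, packing = 'any') {
  return {
    url: `${nodeUrl.replace(/\/+$/, '')}/chunk/${offset}`,
    headers: { 'x-packing': packing },
  };
}

// Fetch the chunk at an absolute weave offset; requires Node 18+ for fetch.
async function fetchChunkAnyPacking(nodeUrl, offset) {
  const { url, headers } = buildChunkRequest(nodeUrl, offset);
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`GET ${url} failed with HTTP ${response.status}`);
  }
  return response.json(); // assumed to be JSON; the chunk data is still packed
}

const request = buildChunkRequest('http://192.168.5.211:1984/', 786433);
console.log(request.url, request.headers);
```

Usage would look like `fetchChunkAnyPacking('http://192.168.5.211:1984', 786433).then(console.log)`, with the node address replaced by your own.
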

maxmetagravity commented 2 months ago

Why does this happen? Is it because of the localnet deployment? How can we avoid it and make the service work normally?

techinged commented 2 months ago

@vird Do you mean that with Arweave 2.7.4, when the private network is launched with the following command, the arweave-js library cannot be used directly?

nohup bin/start init data_dir /opt/arweave/data/ mining_addr O2t2ASJX8mkqRhcvpFZdclhNT6lFNIRKir8--X_OrtY storage_module 0,O2t2ASJX8mkqRhcvpFZdclhNT6lFNIRKir8--X_OrtY mine &

vird commented 2 months ago

@maxmetagravity

> Why does this happen? Is it because of the localnet deployment? How can we avoid it and make the service work normally?

That's not a localnet problem. That's how an Arweave node works. If you want unpacked chunks, you should either:

  • unpack them manually (see above)
  • set up an Arweave gateway
  • use unpacked storage modules (a gateway is usually just a proxy to an Arweave node with unpacked storage modules)

vird commented 2 months ago

@techinged Launching it that way is almost OK (it can be hard to stop the node and to monitor stdout). I personally prefer screen or tmux. This setup is for producing blocks, not for conveniently serving unpacked data.

arweave-js is mainly used to interact with the arweave.net gateway. It can also interact with any Arweave node for uploads, but it cannot download and unpack packed chunks. So only limited use of arweave-js is possible for direct node interaction, whether on localnet or mainnet.

techinged commented 2 months ago

@vird ok, thank you. I'll try to use unpacked storage modules. Hope it works. I'll give feedback.

maxmetagravity commented 2 months ago

thanks @vird

techinged commented 2 months ago

@vird Does Arweave 2.7.4 support mining using unpacked storage modules?

vird commented 2 months ago

No. Since 2.5 you can't mine on unpacked data.

maxmetagravity commented 2 months ago

> No. Since 2.5 you can't mine on unpacked

Does that mean we could use a peer node with unpacked storage_modules as a gateway (no mining) to interact with arweave-js for testing?

vird commented 2 months ago

As far as I know, the only affected API endpoint in arweave-js is /chunk: https://github.com/ArweaveTeam/arweave-js/blob/7d25a16755830e7d342d700c2ff345ca757b6b34/src/common/chunks.ts#L27. So to use getChunk, you should run it against a gateway (e.g. arweave.net), or for testnet/localnet you should have an Arweave node which, for example, has 2 modules:

  • packed (so the node will mine blocks)
  • unpacked (so you can access unpacked data)

maxmetagravity commented 2 months ago

> As far as I know, only affected API endpoint in arweave-js is /chunk https://github.com/ArweaveTeam/arweave-js/blob/7d25a16755830e7d342d700c2ff345ca757b6b34/src/common/chunks.ts#L27 So for use getChunk you should use it against gateway (e.g. arweave.net) or for testnet/localnet you should have arweave node which for example will have 2 modules:
>
>   • packed (so node will mine blocks)
>   • unpacked (so you can access unpacked)

@vird What if we start the node with the following?

nohup ./bin/start data_dir data port 1984 mining_addr DxORDYdnBQ_L8Qs56m_j98x-wnNyQp1SKxILdxC0vss \
  storage_module 0,DxORDYdnBQ_L8Qs56m_j98x-wnNyQp1SKxILdxC0vss mine \
  storage_module 0,unpacked \
  sync_jobs 200 \
  start_from_block_index &

Is the configuration above correct? Are the packing and unpacking processes automatically synchronized? Or does the node pack first, then unpack, and store the packed and unpacked chunks in different subdirectories such as storage_module_0_DxORDYdnBQ_L8Qs56m_j98x-wnNyQp1SKxILdxC0vss and storage_module_0_unpacked?

If the configuration is correct and we upload a file using arweave-js, then after the cache expires in 4 hours, how can we ensure that the file is downloaded from the unpacked storage modules when using arweave-js to retrieve it?

vird commented 2 months ago

> Is the configuration above correct?

Maybe you need to add \ at the end of each line except the last if you want the command to be multiline.

> Are the packing and unpacking processes automatically synchronized?

Automatically.

> how can we ensure that the file is downloaded from the unpacked storage modules

If you want to check the contents: download the file and compare it with the one you sent. arweave-js downloadChunkedData(txid) should do that. (I'm not sure this is what you really intended to ask, but here is the long answer anyway.) If you want to ensure that it's downloaded from unpacked storage modules: there is no other way to download it. Arweave will not send you any chunks (except packed ones, see above) if you don't have an unpacked storage module. If you really want to be sure, patch the Erlang code of /chunk and the subsequent calls so you can see where the data came from.
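
The "download and compare" check can be sketched as follows. This is a minimal illustration, not a tested tool: sameContent and verifyDownload are made-up helper names, the arweave client is assumed to be initialised as in the upload script earlier in this thread, and the call assumes arweave-js exposes downloadChunkedData under arweave.chunks.

```javascript
const { createHash } = require('crypto');
const { readFileSync } = require('fs');

// Hex SHA-256 digest of a Buffer or Uint8Array.
function sha256Hex(bytes) {
  return createHash('sha256').update(bytes).digest('hex');
}

// True when both byte sequences have the same length and the same digest.
function sameContent(a, b) {
  return a.length === b.length && sha256Hex(a) === sha256Hex(b);
}

// Download the transaction data chunk by chunk and compare it with the
// local file that was uploaded. `arweave` is an initialised arweave-js client.
async function verifyDownload(arweave, txId, localPath) {
  const downloaded = await arweave.chunks.downloadChunkedData(txId);
  return sameContent(downloaded, readFileSync(localPath));
}
```

Usage would look like `verifyDownload(arweave, '<txid>', './my-file.bin').then(console.log)`, where the path and transaction ID are placeholders.
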

Alternatively, you can start 2 nodes, one mining with packed data and the other just serving unpacked data, and point them at each other with peer <other node ip>:<other node port>. If you access the node with unpacked data, it will read from that node. If you really want to make sure, stop the node with packed data.

maxmetagravity commented 2 months ago

Oh, I expressed myself badly. What I mean is: how can we specify that arweave-js should download the file from the unpacked storage modules instead of the packed ones for the same TX ID? Does it prioritize downloading content from the unpacked storage modules?

vird commented 2 months ago

It can only pick from unpacked. That happens automatically; you can't specify a storage module in the request.

maxmetagravity commented 2 months ago

@vird Thank you for your thoughtful response and exceptional technical skills. The Arweave team is the best with you on board. :)

techinged commented 2 months ago

@vird Having 2 modules, packed (so the node will mine blocks) and unpacked (so you can access unpacked data), works. Thank you. I'm closing this issue.