ipld / js-car

Content Addressable aRchive format reader and writer for JavaScript
Other
46 stars 7 forks source link

Very strange issue with `writer.close` (silently closes entire node process) #140

Closed piboistudios closed 1 year ago

piboistudios commented 1 year ago

Using @ipld/car version 5.1.1

This simple code snippet is self explanatory, and happens as the console.logs suggest... though you would expect that it doesn't:

import { CarWriter } from "@ipld/car";

const main = async () => {
    console.log('wtf');
    const { writer, out } = await CarWriter.create();
    console.log('lol');
    await writer.close();
    console.log('this never prints'); // this really never prints
}

main()
    .then(() => {
        console.log('done'); // this also never prints
    })
    .catch(e => {
        console.error('ERROR:', e); // no errors either
    })

REPL to prove I'm not lying: https://replit.com/@GabrielSpeed/ToughRadiantMarketing#index.js (Also happens with BunJS https://replit.com/@GabrielSpeed/CreepyGigaGnudebugger#index.ts)

ETA: this happens even if the CAR is created with roots, and also even if the CAR is created with roots and the block containing the root is actually in the CAR.

Basically this happens no matter what state the CarWriter is in. Not sure why or how, but I will tell you the bug is going to be incredibly hard to solve, because it prints no error message or anything at any level.

Seems to have something to do with methods on the _encoder of the CarWriter. Something about the way the _mutex and _encoder play together (and that IteratorChannel_Writer class) is not working out well, and seems to be a problem similar to this: https://stackoverflow.com/questions/64897131/async-for-loop-causes-the-function-to-silently-terminate

piboistudios commented 1 year ago

Also, what was interesting is that if you use writer._encoder.close() directly, it works fine... for a while.

It will still eventually close the process if you run enough operations concurrently (on separate writers). Definitely something going on with the encoder.

piboistudios commented 1 year ago

So, apparently, the CAR Writer only works if the AsyncIterable<UInt8Array> is converted to a node Stream at some point (e.g. via Readable.from(out).pipe(fs.createWriteStream('test' + '.car'))).

If you try to consume the AsyncIterable directly, the entire application will crash.

REPL showing the program executes to completion when you add a line converting it to a readable and pipeing the readable to disk... https://replit.com/@GabrielSpeed/RipeAltruisticTransformation#index.js

piboistudios commented 1 year ago

Yeah... there is something quite unstable about how this library is writing CARs.

I have, at this point, had to wrap the code dealing with the CarWriter with a call from async.retry to have it retry a few times as it would randomly fail with the reasoning being Unexpected end of data

piboistudios commented 1 year ago

The CarBlockIterator's static method fromIterable also produces an iterable that appears to just hang.

Causing ipfs.dag.import to likewise, just hang: https://github.com/ipfs/js-ipfs/blob/b64d4af034f27aa7204e57e6a79f95461095502c/packages/ipfs-core/src/components/dag/import.js#L92

This is using the CAR produced by the CarWriter class.

The CAR file: https://filebin.net/s55v8pfj6b52q0ia

piboistudios commented 1 year ago

It seems like all of the issues with this library, for writing at least, stem from the fact that the writer attempts to allow concurrent puts; there is a lot of code written to accommodate this behavior... (and when you actually use it, it just produces bugs...)

rvagg commented 1 year ago

I hope the venting was therapeutic. But a word of advice - if you want help from authors of open source libraries, abuse and criticism that isn't framed constructively rarely leads to positive outcomes.

This looks like simply a usage issue here, or a misunderstanding about what is going on.

    const { writer, out } = await CarWriter.create();
    console.log('lol');
    await writer.close();

What are you expecting to happen here? The CAR writer is trying to write a header somewhere but you're not consuming it. out needs to go somewhere, it the whole construct can't be closed and cleaned up until that's flushed.

If you need to steam I'd also recommend using pipeline() rather than pipe() directly so you get errors handled properly.

Set up your out consumption before you write into writer. You also need to take care of the asynchronous nature of the flow, you can't manage both of these things in the same flow (awaiting the out chunks while also awaiting the writer writes, you'll just get stuck).

piboistudios commented 1 year ago

What are you expecting to happen here?

    .catch(e => {
        console.error('ERROR:', e); // no errors either
    })

Really anything but the program silently closing.

The CAR writer is trying to write a header somewhere but you're not consuming it. out needs to go somewhere, it the whole construct can't be closed and cleaned up until that's flushed.

This behavior also occurs if the CAR Writer is instantiated with roots. It's not related to the CAR being empty at all, and if that is in fact incorrect behavior, you should throw an error... especially now knowing that in this case the program will just silently close.

Set up your out consumption before you write into writer. You also need to take care of the asynchronous nature of the flow, you can't manage both of these things in the same flow (awaiting the out chunks while also awaiting the writer writes, you'll just get stuck).

How is that at all intuitive or obvious from any of the documentation, examples or test cases the library has?

The documentation presents awaiting writes as an option, not a requirement if you are awaiting downstream.

The way you are intending this library to be used is not forthcoming at all.