lbryio / lbry-sdk

The LBRY SDK for building decentralized, censorship resistant, monetized, digital content apps.
https://lbry.com
MIT License

Daemon.jsonrpc_file_read: read claims from a file and download them #3423

Closed belikor closed 3 years ago

belikor commented 3 years ago

This follows after #3422.

The idea with #3422 is to produce a file with a list of claims. With this pull request we take that file, parse it to get the claim IDs, and then download each of the corresponding streams. The file is a delimiter-separated values file in the CSV style, although by default we use the semicolon (;) as the separator.

lbrynet file summary --file=summary.txt
lbrynet file read --file=summary.txt
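For illustration only, such a file might look like the following; the real columns are whatever #3422 writes, and only the claim_id field matters to file read, so the layout below is a guess:

```text
claim_id;name
<40-hex-claim-id-1>;first-claim-name
<40-hex-claim-id-2>;second-claim-name
```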

Basically, the idea is that we can share lists of claims with other users of the LBRY network. They can import these lists on their own computers (through lbrynet or the LBRY Desktop application) to download the same claims that we have, and thus help seed the same content that we are seeding.

This is a prototype implementation; it works when the number of claims is relatively small. However, once the number of claims is large (more than 500 or so), the Daemon.jsonrpc_file_read method will time out and won't finish processing the list. I'm not sure what can be done to ensure it processes a big list without timing out.

The obvious solution is not to implement this in the SDK itself, but to parse the file externally and call lbrynet get on each of the claims.

# Pseudocode

lines = parse_file("summary.txt")

for item in lines:
    lbrynet get item["claim_id"]

Each call to get would then be independent of the others, with its own timeout.
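The scripting approach above can be sketched in a few lines of Python. This is not part of the SDK; it assumes the summary file uses ; as the separator and has a header row with a claim_id column, both of which are assumptions about #3422's output:

```python
import csv
import subprocess


def parse_summary(path, delimiter=";"):
    """Return the claim IDs listed in a summary file.

    Assumes the first row is a header and that one column is named
    'claim_id'; adjust the field name if the real file differs.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f, delimiter=delimiter)
        return [row["claim_id"].strip() for row in reader if row.get("claim_id")]


def download_all(path):
    """Invoke `lbrynet get` once per claim, so each call has its own timeout."""
    for claim_id in parse_summary(path):
        subprocess.run(["lbrynet", "get", claim_id], check=False)
```

Since every claim is fetched by a separate lbrynet get invocation, one slow or stuck stream only costs that one call rather than timing out the whole list.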

Also, since the file is meant to contain the 'claim_id', get should be able to handle claim IDs, as proposed in #3411.

coveralls commented 3 years ago

Coverage Status

Coverage decreased (-0.5%) to 67.453% when pulling d9acdb858a99c961595bac1a874be91567331a06 on belikor:print-summary-read into 561566e72363aebc74144c994adb2cc869c7d424 on lbryio:master.

eukreign commented 3 years ago

@lyoshenka this PR involves API changes, please review

belikor commented 3 years ago

it works when the number of claims is relatively small; however, once the number of claims is large, more than 500 or so, the Daemon.jsonrpc_file_read method will time out,

Is there a way to increase the timeout? I wonder if I could just pass the --timeout option all the way down to the jsonrpc_get method. The idea is that if we pass a file with an arbitrary number of claims, say 5000, the method will still process every single item.
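Until something like that lands, a per-call bound can also be enforced from the scripting side, so a single stuck download cannot stall the rest of the list. A minimal sketch using subprocess's own timeout; the helper name is mine, and it assumes lbrynet is on PATH:

```python
import subprocess


def run_bounded(cmd, timeout_secs):
    """Run a command, giving up after timeout_secs; True only on exit code 0.

    Intended for wrapping per-claim downloads, e.g.
    run_bounded(["lbrynet", "get", claim_id], 600),
    so each item in a long list gets its own time budget.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_secs).returncode == 0
    except subprocess.TimeoutExpired:
        return False
```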

lyoshenka commented 3 years ago

I'd prefer not to add this feature. It can be accomplished with a few lines of scripting, and as you pointed out it doesn't work when there are many claims (at which point you fall back to scripting anyway).

As I said in https://github.com/lbryio/lbry-sdk/pull/3422#issuecomment-924200281, we should aim to keep the API simple.