open-contracting / ocdskit

A suite of command-line tools for working with OCDS data
https://ocdskit.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Add options to yield multiple packages #115

Closed jpmckinney closed 4 years ago

jpmckinney commented 4 years ago

Follow-up to #83

Yielding a single package can lead to all memory being consumed.

We can do something similar to split-record-packages to yield a package once it reaches a given number of records. This applies to all commands that yield a single package.
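A minimal sketch of the idea, not ocdskit's actual implementation: the function name `package_records_in_chunks` and the `metadata` parameter are hypothetical, chosen for illustration. It assumes records arrive as an iterable and emits a new package each time `size` records have been collected, so at most one package's worth of records is held in memory.

```python
from itertools import islice


def package_records_in_chunks(records, size, metadata=None):
    """Yield record packages that each hold at most `size` records.

    Hypothetical helper for illustration; `metadata` stands in for the
    package-level fields that would be copied onto every package.
    """
    if size < 1:
        raise ValueError("size must be a positive integer")
    iterator = iter(records)
    while True:
        # Pull at most `size` records, so only one chunk is in memory.
        chunk = list(islice(iterator, size))
        if not chunk:
            break
        package = dict(metadata or {})
        package["records"] = chunk
        yield package


# Usage: five records with size=2 give packages of 2, 2 and 1 records.
example = ({"ocid": "example-{}".format(i)} for i in range(5))
sizes = [len(p["records"]) for p in package_records_in_chunks(example, 2)]
```

Because the function is a generator, a caller can serialize each package as it is produced instead of accumulating one large package.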

jpmckinney commented 4 years ago

> Yielding a single package can lead to all memory being consumed.

Fixed in #115.

> We can do something similar to split-record-packages to yield a package once it reaches a given number of records. This applies to all commands that yield a single package.

--size option added to:

Not yet added to:

For compile, we can run:

cat files | ocdskit compile --package | ocdskit echo --root-path records.item | ocdskit package-records --size 1000

To avoid the echo step, we can change the --package option of the compile command to be an --into choice between 'package' and 'records', e.g.

cat files | ocdskit compile --into records | ocdskit package-records --size 1000

Since this would require a change to the API (to prevent incoherent options like --package --records and return_package=True, return_records=True), we'll postpone until there's demand.
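To illustrate why a single choice avoids the incoherent combinations, here is a hedged sketch using argparse. It is not ocdskit's parser or API: `build_parser`, `check_into` and the program name `compile-sketch` are hypothetical. With one `--into` option (or one `into` keyword argument), "package and records at once" simply cannot be expressed.

```python
import argparse

# Hypothetical sketch only; names below are not part of ocdskit.
OUTPUT_MODES = ("package", "records")


def build_parser():
    """Build a parser with one --into choice instead of two boolean flags."""
    parser = argparse.ArgumentParser(prog="compile-sketch")
    parser.add_argument(
        "--into",
        choices=OUTPUT_MODES,
        default=None,
        help="wrap the output in a package, or emit individual records",
    )
    return parser


def check_into(into=None):
    """Library-side counterpart: one argument replaces two booleans."""
    if into is not None and into not in OUTPUT_MODES:
        raise ValueError("into must be one of {}".format(OUTPUT_MODES))
    return into


# Usage: argparse rejects any value outside OUTPUT_MODES by itself.
args = build_parser().parse_args(["--into", "records"])
```

Here `args.into` is either `None` or exactly one of the two modes, so no extra validation of flag combinations is needed.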

jpmckinney commented 4 years ago

Opened follow-up issue #122, so closing.