datalad / datalad-registry

datalad-registry-search CLI/API function #316

Open yarikoptic opened 8 months ago

yarikoptic commented 8 months ago

pending

and likely could be done in tandem with that.

Attn @fangq .

candleindark commented 8 months ago

@yarikoptic We have pagination enabled in the API. How do you want to handle the situation in which the search result contains a large number of URLs? Do you want the CLI to just spit out all of them at once, or in pages?

yarikoptic commented 8 months ago

Spit out each page as soon as it is received, then go to the next page, spit that out, and so on.
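
A minimal sketch of that page-by-page behaviour, for illustration only. The `fetch_page` callable and the `"items"` / `"next"` keys are hypothetical stand-ins for whatever the registry API actually returns; nothing here is taken from the real client code.

```python
import json
import sys
from typing import Callable, Iterator, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[list]:
    """Yield the list of dataset records from each page, one page at a time.

    Assumption: fetch_page(cursor) returns a dict with an "items" list and a
    "next" cursor that is None on the last page. Both keys are hypothetical.
    """
    cursor = None  # None requests the first page
    while True:
        page = fetch_page(cursor)
        yield page["items"]
        cursor = page.get("next")
        if cursor is None:
            break


def print_pages(fetch_page, out=sys.stdout) -> int:
    """Write every record as soon as its page arrives; return the record count."""
    count = 0
    for items in iter_pages(fetch_page):
        for record in items:
            out.write(json.dumps(record) + "\n")
            count += 1
        out.flush()  # make the page visible before the next request starts
    return count
```

Because `iter_pages` is a generator, the next request is only issued after the previous page has been written out, so memory use stays bounded by one page.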

candleindark commented 8 months ago

@yarikoptic I assume that you want the output to the terminal to be a list of datasets in a JSON representation similar to our web API, e.g.

[
    {
      "url": "https://github.com/OpenNeuroDatasets-JSONLD/ds001297.git",
      "id": 31126,
      "ds_id": "2e429bfe-8862-11e8-98ed-0242ac120010",
      "head_describe": "00001-4-g2c4c2f7",
      "annex_key_count": 0,
      "annexed_files_in_wt_count": 495,
      "annexed_files_in_wt_size": 10593648758,
      "last_update_dt": "2024-01-16T17:47:49.227045+00:00",
      "git_objects_kb": 1045,
      "processed": true,
      "last_chk_dt": "2024-03-20T09:56:22.797062+00:00"
    },
    {
      "url": "https://github.com/OpenNeuroDatasets-JSONLD/ds000233.git",
      "id": 30727,
      "ds_id": "aa38f5a2-89f6-11e8-abf3-0242ac120004",
      "head_describe": "1.0.1-4-g3a10cc7",
      "annex_key_count": 0,
      "annexed_files_in_wt_count": 362,
      "annexed_files_in_wt_size": 5090155130,
      "last_update_dt": "2024-01-16T17:41:23.926921+00:00",
      "git_objects_kb": 922,
      "processed": true,
      "last_chk_dt": "2024-03-20T09:25:21.326719+00:00"
    }
]

I can yield a per-page result using get_status_dict, as in https://github.com/datalad/datalad-extension-template/blob/793c5543b8f1385e007ceb2b8cd1db667b9d38e6/datalad_helloworld/hello_cmd.py#L62-L76. However, each yield adds some "boilerplate" around the results of that page. Does DataLad provide any mechanism for an extension to yield output devoid of the "boilerplate", so that page results can be written to the terminal without separators between them?
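
To make the goal concrete, here is a rough sketch of writing several pages as one continuous JSON array, with nothing emitted between pages. It is plain Python and does not use any DataLad API; the function name `stream_json_array` is made up for this example.

```python
import json
from typing import Iterable, TextIO


def stream_json_array(pages: Iterable[list], out: TextIO) -> None:
    """Write the records from all pages as a single JSON array.

    Records are written as each page is consumed, so the output of
    consecutive pages runs together without any per-page wrapper.
    """
    first = True
    out.write("[")
    for page in pages:
        for record in page:
            # A comma goes before every record except the very first one.
            out.write("\n" if first else ",\n")
            out.write(json.dumps(record))
            first = False
        out.flush()
    out.write("]\n" if first else "\n]\n")
```

The result parses as a single JSON document no matter how the records were split across pages, at the cost of the consumer not seeing a complete document until the last page has arrived.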

yarikoptic commented 8 months ago

yes,

I believe it is custom_result_renderer -- see uses https://github.com/search?q=repo%3Adatalad%2Fdatalad%20custom_result_renderer&type=code
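
A hedged sketch of what such a renderer might look like. As far as I know, DataLad commands opt in by setting `result_renderer = "tailored"` on the command class and defining a static `custom_result_renderer(res, **kwargs)`; please check that against the uses linked above. The class below is a plain stand-in that does not subclass DataLad's `Interface`, so it runs without DataLad installed, and the `"datasets"` key is a hypothetical place to carry one page of search hits in the result record.

```python
import json


class SearchSketch:
    """Stand-in for a DataLad command class (a real one would subclass Interface)."""

    # Assumption: this attribute asks DataLad to call custom_result_renderer
    # instead of printing its generic per-result line.
    result_renderer = "tailored"

    @staticmethod
    def custom_result_renderer(res, **kwargs):
        """Print only the dataset records of one result, one JSON object per line.

        "datasets" is a hypothetical key; the standard fields of the result
        record (action, status, path, ...) are deliberately not printed.
        """
        for record in res.get("datasets", []):
            print(json.dumps(record))
```

Called once per yielded page, this prints the records back to back, so the terminal output carries no per-page wrapper.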