googleapis / python-bigquery-magics

https://googleapis.dev/python/bigquery-magics/latest/
Apache License 2.0

Request to improve the `tqdm` progress bar and output #35

Open meredithslota opened 1 year ago

meredithslota commented 1 year ago
Hi @aribray, looks like the issue is now solved! Thank you very much for that.

I only have one nitpick regarding the new progress bar setup: it's a bit weird to have two progress bars, and one of them is pushed to the right when the job ID appears:

image

I suppose it would be better to have just a single progress bar on its own line (job ID should be its own line too).

If you want to make the output more complete, I've also been showing the data processed at the very end in my own BigQuery magic, like this:

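import ipywidgets
from IPython.display import display
from humanize.filesize import naturalsize
...
if args.info:
    # Human-readable byte count for the finished query job.
    processed = naturalsize(query.total_bytes_processed)
    # Cached results process no new data, so say that instead of showing a size.
    display(ipywidgets.HTML(value='Data processed: ' + (
        f'<b>{processed}</b>' if not query.cache_hit
        else 'none <b>(returned from cache)</b>'
    )))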

With this addition the output looks like this:

image

It might look better if you add a colon after the "Job ID" too, but that's just my subjective preference.
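The layout requested above (job ID on its own line with a colon, one progress bar on its own line, and a final "Data processed" line) could be sketched roughly as follows. This is only a plain-text illustration, not the magics' actual rendering code: `format_size` and `render_output` are hypothetical helper names, and the bar is drawn with ASCII characters rather than `tqdm`.

```python
from typing import Optional


def format_size(num_bytes: int) -> str:
    """Render a byte count with decimal (SI) units, e.g. 1500000 -> '1.5 MB'."""
    size = float(num_bytes)
    for unit in ("Bytes", "kB", "MB", "GB", "TB"):
        # Stop at the first unit that keeps the number below 1000,
        # or at the largest unit we know about.
        if size < 1000 or unit == "TB":
            break
        size /= 1000
    return f"{int(size)} {unit}" if unit == "Bytes" else f"{size:.1f} {unit}"


def render_output(
    job_id: str,
    fraction: float,
    total_bytes_processed: Optional[int] = None,
    cache_hit: bool = False,
    width: int = 20,
) -> str:
    """Hypothetical layout: one line per item, never two bars side by side."""
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to [0, 1]
    filled = int(round(fraction * width))
    lines = [
        f"Job ID: {job_id}",  # job ID on its own line, with a colon
        f"[{'#' * filled}{'.' * (width - filled)}] {fraction:.0%}",
    ]
    if cache_hit:
        lines.append("Data processed: none (returned from cache)")
    elif total_bytes_processed is not None:
        lines.append(f"Data processed: {format_size(total_bytes_processed)}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_output("abc123", 1.0, total_bytes_processed=1_500_000))
```

Keeping each item on its own line means the bar never shifts sideways when the job ID arrives.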

Originally posted by @GergelyKalmar in https://github.com/googleapis/python-bigquery/issues/1146#issuecomment-1305399150

tswast commented 10 months ago

I believe the two bars are from (a) running the query and (b) downloading the data. These are created in two very different places in the code, so implementing this will require some refactoring.
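One way to picture the refactoring is a small shared object that both code paths report into, so that a single bar can be driven from it. The sketch below is an assumption for illustration only: `TwoPhaseProgress`, its method names, and the 50/50 weighting are hypothetical and do not come from the library.

```python
from dataclasses import dataclass


def _ratio(done: float, total: float) -> float:
    """Fraction complete, clamped to [0, 1]; an empty phase counts as done."""
    if total <= 0:
        return 1.0
    return min(max(done / total, 0.0), 1.0)


@dataclass
class TwoPhaseProgress:
    """Hypothetical tracker merging query and download progress into one value."""

    query_weight: float = 0.5  # share of the bar given to the query phase (arbitrary)
    query_fraction: float = 0.0
    download_fraction: float = 0.0

    def update_query(self, steps_done: int, steps_total: int) -> None:
        self.query_fraction = _ratio(steps_done, steps_total)

    def update_download(self, rows_done: int, rows_total: int) -> None:
        self.download_fraction = _ratio(rows_done, rows_total)

    @property
    def overall(self) -> float:
        """Weighted sum of both phases, suitable for driving a single bar."""
        return (
            self.query_weight * self.query_fraction
            + (1.0 - self.query_weight) * self.download_fraction
        )

    @property
    def label(self) -> str:
        """Description to show next to the bar for the current phase."""
        return "Running query" if self.query_fraction < 1.0 else "Downloading"


if __name__ == "__main__":
    progress = TwoPhaseProgress()
    progress.update_query(1, 2)
    print(progress.label, f"{progress.overall:.0%}")
    progress.update_query(2, 2)
    progress.update_download(50, 100)
    print(progress.label, f"{progress.overall:.0%}")
```

The weighting is a design choice: a fixed split is simple but can make the bar move unevenly when one phase dominates the total time.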

chalmerlowe commented 9 months ago

@tswast @meredithslota

If the two progress bars represent two separate processes (running the query and downloading the data), as Tim describes, do we need to refactor this into a single progress bar? Or is it more informative for the user to keep both bars for long-running queries or large data downloads?

If we choose to go with only one bar, the "running the query" bar seems generally more informative at a system level.

tswast commented 2 months ago

Per https://github.com/googleapis/python-bigquery/pull/1965, moving this issue to the bigquery-magics repo.

Regarding choosing one progress bar, we could perhaps instead output a link to the job for long-running queries. See

https://github.com/googleapis/python-bigquery-dataframes/blob/014765c22410a0b4559896d163c440f46f7ce98f/bigframes/formatting_helpers.py#L233

for the job-linking logic in BigQuery DataFrames.
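As a rough illustration of what such a link could look like, here is a minimal sketch that builds a Cloud console URL from a job's project, location, and ID. The function name `job_console_url` is hypothetical, and the URL pattern is an assumption about the console's query-results page rather than something confirmed in this thread; check it against the linked BigQuery DataFrames helper before relying on it.

```python
from urllib.parse import urlencode


def job_console_url(project_id: str, location: str, job_id: str) -> str:
    """Build a link to a query job in the Cloud console (assumed URL pattern)."""
    query = urlencode(
        {
            "project": project_id,
            "j": f"bq:{location}:{job_id}",  # assumed job reference format
            "page": "queryresults",
        },
        safe=":",  # keep the colons in the job reference readable
    )
    return f"https://console.cloud.google.com/bigquery?{query}"


if __name__ == "__main__":
    print(job_console_url("my-project", "US", "job_123"))
```

A magic could print this link once a query has been running longer than some threshold, instead of (or alongside) a progress bar.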

See also: https://github.com/googleapis/python-bigquery-magics/issues/15#issuecomment-2195564214