Open weaverba137 opened 5 years ago
@jacquesalice, this very old ticket was was ultimately based on the idea that the tbl_stat
table was not being updated regularly. Do you know if that is now updated routinely?
Hi @weaverba137 it does look like tbl_stat
is being regularly updated, but it doesn't look like it properly stores the row count (nrows
) for Views. So for a View like sdss_dr17.specobj, it says nrows=0
. Example:
query = "SELECT nrows FROM tbl_stat WHERE schema='sdss_dr17' and tbl_name='specobj'"
response = qc.query(sql=query)
print(response)
--------
nrows
0
I'm not sure if this is something @mjfitzpatrick could easily implement, or if we should just add a note in the notebooks saying that tbl_stat
cannot be used to obtain information about tables stored as Views.
The row counts in tbl_stat are only approximations from one of the postgres system tables, that approximation cannot be used for either views or FDW tables and so values are reported as zero. To get a full set of information for a table, the stat would need to do a full "select count(*)...." on each view, or execute on the remote host of an FDW and be merged back in on the production db.
The tbl_stat was initially an ad hoc tool when the system was first built. Since then it's been used in various places despite never being deemed part of the production system. As such, updates are sporadic, backups are coincidental and work to fill in the view/fdw issues has never really come up.
On Wed, Aug 23, 2023 at 4:00 PM Alice Jacques @.***> wrote:
Hi @weaverba137 https://github.com/weaverba137 it does look like tbl_stat is being regularly updated, but it doesn't look like it properly stores the row count (nrows) for Views. So for a View like sdss_dr17.specobj, it says nrows=0. Example:
query = "SELECT nrows FROM tbl_stat WHERE schema='sdss_dr17' and tbl_name='specobj'" response = qc.query(sql=query) print(response)
nrows 0
I'm not sure if this is something @mjfitzpatrick https://github.com/mjfitzpatrick could easily implement, or if we should just add a note in the notebooks saying that tbl_stat cannot be used to obtain information about tables stored as Views.
— Reply to this email directly, view it on GitHub https://github.com/astro-datalab/notebooks-latest/issues/5#issuecomment-1690752194, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABP7I6YFMUQP74EZNRW763DXW2DQHANCNFSM4HJQ6XXA . You are receiving this because you were mentioned.Message ID: @.***>
The
tbl_stat
table is meant for internal operations, but several references to it exist in these notebooks:DataLabNotebookTemplate.ipynb
01_GettingStartedWithDataLab/02_GettingStartedWithDataLab.ipynb
03_ScienceExamples/LargeScaleStructure/LargeScaleStructureSdssLs.ipynb
03_ScienceExamples/ExploringM31/M31WithPhat.ipynb
04_HowTos/QueryClient/How_to_use_the_Data_Lab_QueryClient.ipynb
The table does not appear to be updated automatically, and until very recently, was not installed at all on
gp04
. Furthermore, the row counts are approximations. There are probably better and more "official" ways to get table sizes frompg_catalog
. Additional metadata would be available in thetap_schema
tables, e.g. the list of available schemas.