segasai / q3c

PostgreSQL extension for spatial indexing on a sphere
GNU General Public License v2.0
76 stars 27 forks source link
astronomy coordinates nearest-neighbors postgresql query sphere

Build Status Coverage Status

Q3C

Author: Sergey Koposov, University of Edinburgh

Copyright (c) 2004-2024 Sergey Koposov

Email: skoposov AT ed DOT ac DOT uk

Fresh versions of the software could be obtained here: https://github.com/segasai/q3c

To read more about the Q3C indexing, you can check out the paper published in ADASS conference proceedings http://adsabs.harvard.edu/abs/2006ASPC..351..735K The citation is "Koposov, S., & Bartunov, O. 2006, Astronomical Society of the Pacific Conference Series, 351, 735". Q3C is also registered in the ASCL library https://ascl.net/1905.008 . If you use Q3C, you are kindly asked to cite the 2006 paper. I am also always happy to hear about any usage of Q3C.

Prerequisites

In order to use Q3C you need to have a PostgreSQL database installed (version 9.1 or later). If you have PostgreSQL version lower than 9.1, you will need to use an older version of Q3C (1.4.x).

To successfully compile Q3C you must have pg_config in your PATH (that means that you may need to install the -devel versions of PostgreSQL packages)

Installation

After the installation you will have several new functions in PostgreSQL. All names of these functions start with the "q3c_" prefix.

Updating

If you are updating from previous version of q3c, you still need to do the make, make install steps, but after that you need to do

instead of 'CREATE EXTENSION'. Here A.B.C is the placeholder for the version, i.e. '2.0.0'; You also may want to check what version of q3c is installed by either of following commands:

Table preparation for Q3C

To begin use Q3C for searches and cross-matches you should create the indexes on your tables.

In this demonstration we'll assume that you have the table called "mytable" with "ra" and "dec" columns (right ascension and declination in degrees).

First, you will need to create the spatial index, using the command:

my_db# CREATE INDEX ON mytable (q3c_ang2ipix(ra, dec));

The next procedure is optional but strongly recommended: cluster the table using newly created index. The clustering procedure is the procedure of ordering the data on the disk according to the Q3C spatial index values, which will ensure faster queries if your table is very large. If the data have been ingested in the database in ordered fashion (i.e. along some spherical zones), the clustering step can be omitted (although still recommended). The clustering step may take a while (hours) if your dataset is large.

my_db# CLUSTER mytable_q3c_ang2ipix_idx ON mytable;

Alternatively, instead of CLUSTER, you can also just reorder your table yourself before indexing (can be faster) my_db# create table mytable1 as select * from mytable order by q3c_ang2ipix(ra,dec);

The last step after creating the index is analyzing your table:

my_db# ANALYZE mytable;

Now you should be able to use q3c queries.

Q3C functions

IMPORTANT Throughout q3c it is assumed that all the angles (ra, dec and distances) are in units of angular degrees, the proper motions are in mas/year, and that the units for the epochs are years, i.e. 2000.5, 2010.5.

Throughout the rest of the text I will use ipix as reference to the 64 bit integer identifier of the pixel on the sphere in Q3C.

The functions installed by Q3C are:

Query examples

my_db# SELECT * FROM mytable WHERE q3c_radial_query(ra, dec, 11, 12, 0.1);

The order of arguments of q3c_radial_query() is important, so that the column names of the table should come first, and the location where you search after, otherwise the index won't be used.

There is also an alternative way of doing cone searches which could be a bit faster if the table that you are working with that table that is small. In that case q3c_radial_query may be too CPU heavy. So you may want to query the table:

  my_db# SELECT * FROM mytable WHERE q3c_join(11, 12, ra, dec, 0.1);

Note here ra,dec column names are 3rd and 4th argument respectively.

my_db# SELECT * FROM mytable WHERE
        q3c_poly_query(ra, dec, ARRAY[0, 0, 2, 0, 2, 1, 0, 1]);
my_db# SELECT * FROM table1 AS a, table2 AS b WHERE
        q3c_join(a.ra, a.dec, b.ra, b.dec, 0.001);

The order of arguments is important again, because it determines whether an index is going to be used or not. The ra,dec columns from the table with the index should go after the ra,dec columns from the table without the index.

It is important that the query will return ALL the pairs within the matching distance, rather than just nearest neighbors. See below for examples of nearest neighbor queries.

If every object in table1 have their own error circle/matching radius ( we'll assume that the radius of that circle in degrees is stored in the column "err"), then you should run the query:

my_db# SELECT * FROM table1 AS a, table2 AS b WHERE
        q3c_join(a.ra, a.dec, b.ra, b.dec, a.err);

It is possible to do the join when the error area of each record of the catalogue is an ellipse. Then you can do the query like this

my_db# SELECT * FROM table1 AS a, table2 AS b WHERE
        q3c_ellipse_join(a.ra, a.dec, b.ra, b.dec, a.maj_ax
        a.axis_ratio, a.PA);

where axis_ratio is the column with axis ratio of the ellipses and PA is the column with the positional angles of them, and maj_ax is the column with semi-major axes of those ellipses.

my_db# SELECT * FROM table1 AS a, table2 AS b WHERE
        q3c_join_pm (a.ra, a.dec, a.pmra, a.pmdec, 1,
        a.epoch, b.ra, b.dec, b.epoch, 30, 0.001);

If you want only the objects that have the neighbours then the query will look like that

my_db# SELECT  t.*, ss.* FROM mytable AS t,
       LATERAL (
               SELECT s.* 
                    FROM 
                        sdssdr9.phototag AS s
                    WHERE
                        q3c_join(t.ra, t.dec, s.ra, s.dec, 1./3600)
                    ORDER BY
                        q3c_dist(t.ra,t.dec,s.ra,s.dec)
                    ASC LIMIT 1
               ) as ss ;

This query selects the only nearest neighbour for each row in your table. If there are no neighbours, the columns are filled with nulls. This query requires presence of some object id column with the index on the table.

my_db# WITH x AS MATERIALIZED (
      SELECT *, ( SELECT objid FROM sdssdr9.phototag AS p WHERE q3c_join(m.ra, m.dec, p.ra, p.dec, 1./3600)
                  ORDER BY q3c_dist(m.ra, m.dec, p.ra, p.dec) ASC LIMIT 1) AS match_objid  FROM mytable AS m 
          )
    SELECT * FROM x, sdssdr9.phototag AS s WHERE x.match_objid=s.objid;

Limitations

Performance issues/Slow queries

If you experience slow q3c queries, the following list may suggest possible solutions.