gvwilson / sql-tutorial

The Querynomicon: An Introduction to SQL for Wary Data Scientists
https://gvwilson.github.io/sql-tutorial/
Other
447 stars 47 forks source link

Removing primary key from 040 #27

Closed defuneste closed 8 months ago

defuneste commented 8 months ago

As mentioned here https://mastodon.social/@gvwilson/111873017779613861 primary key are linked to multiple topics (very useful for joins but also part a data constraint/part of designing a DB).

One option to avoid this difficult topic in 040 could just be using an unique id w/o telling the reader that it could be a PK.

I do not know if it is possible in SQLite but here is an example with PostgresSQL:

create table staging.person (
    ident  serial,
    name text not null
);

insert into staging.person (name)
values
    ('mik'),
    ('po'),
    ('tay');

Sadly if followed by this:

insert into staging.person (ident, name)
values
    (1, 'bip');

It will return:

ident name
1 "mik"
1 "bip"
2 "po"
3 "tay"

but instead this is working "as expected" (or follow the same pattern than the one used to generate the first 3 rows):

insert into staging.person (name)
values
    ('bip');
ident name
1 "mik"
2 "po"
3 "tay"
4 "bip"

Link to SERIAL (and big serial): https://www.postgresql.org/docs/current/datatype-numeric.html#DATATYPE-SERIAL

Then PK and FK could be introduced in a part about normalization.

This "SQL tutorial" is a great idea, thanks a lot for writing it!

defuneste commented 8 months ago

I just realized I used a "staging" schema that is not needed in your example ...

gvwilson commented 8 months ago

Thanks for the suggestion - I'm not familiar with "staging" schemas, so could you please expand a bit on the idea? The file misc/sql_keywords.txt contains a list of all keywords recognized by SQLite, and unfortunately serial doesn't appear among them - is it PostgreSQL-specific?

defuneste commented 8 months ago

PG has the concept of "schema" to organize a database and their users. Here, I am using "staging" ie a specific schema to do some test instead of going in my "$user" or the "public" schema (if you do not specify the schema PG will use a search_path and go for the first you are authorized to write on). The name "staging" has no particular meaning I just learned that way and while I was using your code in my local version of PG I have used it without thinking. I think the meaning of "schema" is different from between PG and SQLite. In the first one it also includes permissions/privileges (unsure if SQlite has that).

Yes serial (and bigserial) are Postrgesql data type (afaik they are specific) and similar to autoincrement (but do not require the PK ...).