mikeizbicki / cmc-csci143

big data course materials

Confused about removing other unique constraints. #494

Closed by echen4628 7 months ago

echen4628 commented 7 months ago

In the normalized_batch_parallel part of the assignment, we need to remove unique constraints from the other columns of the table. The instructions read:

There are also several other UNIQUE constraints (mostly in PRIMARY KEYs) that need to be removed from other columns of the table. 

Looking at services/pg_normalized_batch/schema.sql, I don't see any other unique columns besides url in the urls table, whose UNIQUE constraint I already removed. To me, this means that the only unique constraints left are in the primary keys (like id_users or id_tweets). I was under the impression that primary keys must be unique. Can someone give a hint on how to remove the unique constraints from the other columns?

Perhaps, where id_users is referenced in the tweets table, I should stop declaring it as a foreign key and also change its type to TEXT? Any ideas would be helpful! Thanks.

mikeizbicki commented 7 months ago

You're correct that PRIMARY KEY implies UNIQUE. What I meant is that, because of this, all PRIMARY KEY constraints must also be removed. For example, the first lines of the tweets table should be changed from

CREATE TABLE tweets (
    id_tweets BIGINT PRIMARY KEY,
    id_users BIGINT,

to

CREATE TABLE tweets (
    id_tweets BIGINT,
    id_users BIGINT,
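
To see why removing PRIMARY KEY also removes the uniqueness check, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the course's Postgres database (an assumption made only so the example runs anywhere), and a simplified two-column tweets table rather than the full schema from schema.sql. The helper name count_after_duplicate_insert is hypothetical.

```python
import sqlite3


def count_after_duplicate_insert(create_sql):
    """Create the table, insert the same id twice, return (row_count, error)."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute(create_sql)
    error = None
    for _ in range(2):
        try:
            conn.execute(
                "INSERT INTO tweets (id_tweets, id_users) VALUES (1, 10)"
            )
        except sqlite3.IntegrityError as exc:
            # Raised when a uniqueness constraint rejects the row.
            error = str(exc)
    (count,) = conn.execute("SELECT count(*) FROM tweets").fetchone()
    conn.close()
    return count, error


# Simplified stand-ins for the "before" and "after" table definitions.
with_pk = count_after_duplicate_insert(
    "CREATE TABLE tweets (id_tweets BIGINT PRIMARY KEY, id_users BIGINT)"
)
without_pk = count_after_duplicate_insert(
    "CREATE TABLE tweets (id_tweets BIGINT, id_users BIGINT)"
)

print("with PRIMARY KEY:   ", with_pk)
print("without PRIMARY KEY:", without_pk)
```

With the PRIMARY KEY in place, the second insert is rejected and only one row is stored; without it, both rows are accepted. Postgres reports the violation with a different error message, but the underlying rule (PRIMARY KEY implies UNIQUE) is the same.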