mikeizbicki / cmc-csci143

big data course materials

Final Project: Install Rum #537

Open abizermamnoon opened 1 month ago

abizermamnoon commented 1 month ago

Hi,

To install RUM, I first modified the Dockerfile in the services/postgres directory:

lambda-server:~/bigdata/final_project $ cat services/postgres/Dockerfile
FROM postgres:13
# install system packages for building postgres extensions
RUN apt-get update && apt-get install -y \
    less \
    make \
    vim \
    postgresql-server-dev-13 \
    postgresql-plpython3-13 \
    python3 \
    python3-pip \
    sudo \
    wget

# install a newer version of git
# these libraries are required just for installing a new version of git
RUN apt-get install -y \
    libcurl4-gnutls-dev \
    libexpat1-dev gettext \
    libz-dev \
    libssl-dev \
    asciidoc \
    xmlto \
    docbook2x
RUN cd /tmp \
 && wget https://www.kernel.org/pub/software/scm/git/git-2.30.1.tar.gz \
 && tar -xzf git-2.30.1.tar.gz \
 && cd git-2.30.1 \
 && ./configure \
 && make \
 && make install \
 && rm -rf /tmp/git-2.30.1

# install postgres extensions from source
RUN cd /tmp \
 && git clone https://github.com/postgrespro/rum \
 && cd rum \
 && git checkout 1.3.7 \
 && make USE_PGXS=1 \
 && make USE_PGXS=1 install \
 && rm -rf /tmp/rum

WORKDIR /tmp/db

RUN mkdir /data && chown postgres /data

# copy over the pagila database;
# we rename the files so that they get executed in the correct order
COPY schema.sql /docker-entrypoint-initdb.d/01.sql

When I run:

docker-compose up -d --build

I get the following error:

Unpacking postgresql-client-13 (13.14-1.pgdg120+2) over (13.13-1.pgdg120+1) ...
dpkg: error processing archive /tmp/apt-dpkg-install-78gelh/136-postgresql-client-13_13.14-1.pgdg120+2_amd64.deb (--unpack):
 cannot copy extracted data for './usr/lib/postgresql/13/bin/pg_isready' to '/usr/lib/postgresql/13/bin/pg_isready.dpkg-new': failed to write (Disk quota exceeded)
dpkg: error while cleaning up:
 unable to restore backup version of '/usr/lib/postgresql/13/bin/pg_dumpall': Stale file handle
dpkg: error while cleaning up:
 unable to restore backup version of '/usr/lib/postgresql/13/bin/pg_dump': Stale file handle
dpkg: error while cleaning up:
 unable to restore backup version of '/usr/lib/postgresql/13/bin/pg_config': Stale file handle
dpkg: error while cleaning up:
 unable to restore backup version of '/usr/lib/postgresql/13/bin/pg_basebackup': Stale file handle
dpkg: error while cleaning up:
 unable to restore backup version of '/usr/lib/postgresql/13/bin/dropuser': Stale file handle
dpkg: error while cleaning up:
 unable to restore backup version of '/usr/lib/postgresql/13/bin/dropdb': Stale file handle
dpkg: error while cleaning up:
 unable to restore backup version of '/usr/lib/postgresql/13/bin/createuser': Stale file handle
dpkg: error while cleaning up:
 unable to restore backup version of '/usr/lib/postgresql/13/bin/createdb': Stale file handle
dpkg: error while cleaning up:
 unable to restore backup version of '/usr/lib/postgresql/13/bin/clusterdb': Stale file handle
tar: ./md5sums: Wrote only 9216 of 10240 bytes
tar: ./postinst: Cannot write: Disk quota exceeded
tar: ./postrm: Cannot write: Disk quota exceeded
tar: ./preinst: Cannot write: Disk quota exceeded
tar: ./prerm: Cannot write: Disk quota exceeded
tar: ./templates: Cannot write: Disk quota exceeded
tar: Exiting with failure status due to previous errors
dpkg-deb: error: tar subprocess returned error exit status 2
dpkg: error processing archive /tmp/apt-dpkg-install-78gelh/137-postgresql-13_13.14-1.pgdg120+2_amd64.deb (--unpack):
 dpkg-deb --control subprocess returned error exit status 2
Selecting previously unselected package postgresql-plpython3-13.
Preparing to unpack .../138-postgresql-plpython3-13_13.14-1.pgdg120+2_amd64.deb ...
Unpacking postgresql-plpython3-13 (13.14-1.pgdg120+2) ...
dpkg: error processing archive /tmp/apt-dpkg-install-78gelh/138-postgresql-plpython3-13_13.14-1.pgdg120+2_amd64.deb (--unpack):
 cannot copy extracted data for './usr/lib/postgresql/13/lib/bitcode/hstore_plpython3.index.bc' to '/usr/lib/postgresql/13/lib/bitcode/hstore_plpython3.index.bc.dpkg-new': failed to write (Disk quota exceeded)
tar: ./md5sums: Wrote only 4608 of 10240 bytes
tar: Exiting with failure status due to previous errors
dpkg-deb: error: tar subprocess returned error exit status 2
dpkg: error processing archive /tmp/apt-dpkg-install-78gelh/139-postgresql-server-dev-13_13.14-1.pgdg120+2_amd64.deb (--unpack):
 dpkg-deb --control subprocess returned error exit status 2
Selecting previously unselected package publicsuffix.
Preparing to unpack .../140-publicsuffix_20230209.2326-1_all.deb ...
Unpacking publicsuffix (20230209.2326-1) ...
dpkg: error processing archive /tmp/apt-dpkg-install-78gelh/140-publicsuffix_20230209.2326-1_all.deb (--unpack):
 cannot copy extracted data for './usr/share/doc/publicsuffix/copyright' to '/usr/share/doc/publicsuffix/copyright.dpkg-new': failed to write (Disk quota exceeded)
Selecting previously unselected package python3.11-dev.
Preparing to unpack .../141-python3.11-dev_3.11.2-6_amd64.deb ...
dpkg: unrecoverable fatal error, aborting:
 unable to flush /var/lib/dpkg/updates/tmp.i after padding: Disk quota exceeded
E: Sub-process /usr/bin/dpkg returned an error code (2)
The command '/bin/sh -c apt-get update && apt-get install -y     less     make     vim     postgresql-server-dev-13     postgresql-plpython3-13     python3     python3-pip     sudo     wget' returned a non-zero code: 100
ERROR: Service 'postgres' failed to build : Build failed

I checked my disk quota:

lambda-server:~/bigdata/final_project $ df -h
Filesystem      Size  Used Avail Use% Mounted on
udev            126G     0  126G   0% /dev
tmpfs            26G  5.4M   26G   1% /run
/dev/nvme0n1p2  1.8T  1.4T  253G  85% /
tmpfs           126G  112K  126G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           126G     0  126G   0% /sys/fs/cgroup
/dev/nvme0n1p1  511M  5.3M  506M   2% /boot/efi
tmpfs            26G  140K   26G   1% /run/user/1070
tmpfs            26G     0   26G   0% /run/user/1125
tmpfs            26G     0   26G   0% /run/user/1151
tmpfs            26G     0   26G   0% /run/user/1149
/dev/sda1        48T   30T   16T  67% /data
tmpfs            26G     0   26G   0% /run/user/1067
tmpfs            26G   12K   26G   1% /run/user/1284
tmpfs            26G   16K   26G   1% /run/user/1003
tmpfs            26G  164K   26G   1% /run/user/1240
tmpfs            26G     0   26G   0% /run/user/1306
tmpfs            26G  4.0K   26G   1% /run/user/1358
tmpfs            26G  368K   26G   1% /run/user/1323
tmpfs            26G  160K   26G   1% /run/user/1324
tmpfs            26G  160K   26G   1% /run/user/1347
tmpfs            26G  172K   26G   1% /run/user/1341
tmpfs            26G  160K   26G   1% /run/user/1319
tmpfs            26G  160K   26G   1% /run/user/1316
tmpfs            26G  300K   26G   1% /run/user/1328
tmpfs            26G  160K   26G   1% /run/user/1365
tmpfs            26G  160K   26G   1% /run/user/1336
tmpfs            26G   92K   26G   1% /run/user/1332
tmpfs            26G  368K   26G   1% /run/user/1346
tmpfs            26G  160K   26G   1% /run/user/1339
tmpfs            26G  168K   26G   1% /run/user/1366
tmpfs            26G  168K   26G   1% /run/user/1342
tmpfs            26G   24K   26G   1% /run/user/1322
tmpfs            26G   24K   26G   1% /run/user/1348

My UID is 1322, so I think I have 26G of available space. I am done with twitter_postgres_indexes. Should I delete the pg_denormalized and pg_normalized_batch data I created to free up more space?

mikeizbicki commented 1 month ago

The df command tells you how much disk space is free (df = disk free) on each drive; it is unrelated to quotas. du tells you how much disk space you are using (du = disk usage). Running du -hd1 in your home folder will show which folders are using too much space so you can delete them. You only have 10GB available under your quota, and the contents of the $HOME/bigdata folder are not included in the quota.
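
As a rough sketch of the du check described above: the helper name usage_by_folder is made up for illustration, and piping through sort -h (GNU coreutils) is an added convenience rather than part of the original advice. It prints the size of each immediate subfolder, smallest first, so the largest folders end up at the bottom.

```shell
# usage_by_folder DIR (hypothetical helper): print the size of each
# immediate subfolder of DIR, sorted so the largest entries come last.
usage_by_folder() {
    # -h: human-readable sizes, -d1: descend only one level.
    # Permission errors are hidden so the listing stays readable.
    du -hd1 "$1" 2>/dev/null | sort -h
}

# Inspect the home folder, which is where the quota applies in this thread.
usage_by_folder "$HOME"
```

The last line is the total for the folder itself; the lines just above it are the best candidates for cleanup.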

ains-arch commented 3 weeks ago

Hi @abizermamnoon

I'm working on getting RUM installed correctly, and the Dockerfile below seems to be working. I'm curious: did you run into problems later on that necessitated the other installations?

FROM postgis/postgis

RUN apt-get update && apt-get install -y \
    less \
    make \
    vim \
    git \
    gcc \
    postgresql-server-dev-16

# install rum extensions from source
RUN cd /tmp \
 && pwd \
 && git clone https://github.com/postgrespro/rum \
 && pwd \
 && ls \
 && cd rum \
 && make USE_PGXS=1 \
 && make USE_PGXS=1 install \
 && rm -rf /tmp/rum

WORKDIR /tmp/db

RUN mkdir /data && chown postgres /data

# copy over the pagila database;
# we rename the files so that they get executed in the correct order
COPY schema.sql /docker-entrypoint-initdb.d/01.sql

abizermamnoon commented 3 weeks ago

FROM postgis/postgis
# Install system packages for building PostgreSQL extensions
RUN apt-get update && apt-get install -y \
    git \
    build-essential \
    postgresql-server-dev-all \
    postgresql-12-rum \
    less \
    make \
    vim

# Install RUM extension for PostgreSQL
RUN git clone https://github.com/postgrespro/rum /tmp/rum

WORKDIR /tmp/rum
RUN make USE_PGXS=1
RUN make USE_PGXS=1 install

WORKDIR /tmp/db

RUN mkdir /data && chown postgres /data

# copy over the pagila database;
# we rename the files so that they get executed in the correct order
COPY schema.sql /docker-entrypoint-initdb.d/01.sql