PrincetonUniversity / cpf

Collaborative Parallelization Framework (CPF)
MIT License

Make "Artifact" section from ASPLOS'20 paper work #32

Open andreybokhanko opened 3 years ago

andreybokhanko commented 3 years ago

Hi,

Sorry if this is the wrong place to submit this issue; if so, please let me know where it should go instead.

After a lot of trial and error, I discovered that the instructions in the "A. Artifact Appendix" section of your ASPLOS'20 paper (https://liberty.princeton.edu/Publications/asplos20_perspective.pdf) don't work.

Specifically, the "docker build" command fails with the following error:

Step 3/24 : RUN apt-get install -y curl
 ---> Running in ba1c4be6231f
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package curl
Removing intermediate container ba1c4be6231f
The command '/bin/sh -c apt-get install -y curl' returned a non-zero code: 100

This can be solved by adding "RUN apt-get update" as the first RUN line; however, the build then fails while installing the required Python packages:

Step 9/25 : RUN pip3 install -r /root/requirements.txt
 ---> Running in d8d51cffe719
Collecting joblib>=0.13.2 (from -r /root/requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/55/85/70c6602b078bd9e6f3da4f467047e906525c355a4dacd4f71b97a35d9897/joblib-1.0.1-py3-none-any.whl (303kB)
Collecting numpy>=1.16.1 (from -r /root/requirements.txt (line 2))
  Downloading https://files.pythonhosted.org/packages/82/a8/1e0f86ae3f13f7ce260e9f782764c16559917f24382c74edfb52149897de/numpy-1.20.2.zip (7.8MB)
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-zoermkau/numpy/setup.py", line 30, in <module>
        raise RuntimeError("Python version >= 3.7 required.")
    RuntimeError: Python version >= 3.7 required.

Apparently, many packages (most crucially, scipy) require Python 3.7 at minimum, while Ubuntu 16.04 ships with Python 3.5.
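The traceback above boils down to an interpreter version check. As a minimal sketch (not part of the artifact), a check like the following could be run inside the base image before attempting "pip3 install". The 3.7 threshold is taken from the numpy error message, and the helper name meets_minimum is made up for illustration.

```python
import sys

# Threshold taken from the numpy error message above; other packages may
# require a different minimum, so treat this value as an assumption.
REQUIRED = (3, 7)


def meets_minimum(version_info=None, required=REQUIRED):
    """Return True if the (major, minor) version is at least `required`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(required)


if __name__ == "__main__":
    current = "{}.{}.{}".format(*sys.version_info[:3])
    needed = "{}.{}".format(*REQUIRED)
    if meets_minimum():
        print("Python {} satisfies >= {}".format(current, needed))
    else:
        print("Python {} is too old; >= {} is required".format(current, needed))
```

The script deliberately avoids f-strings so that it still parses on Python 3.5, where it reports the interpreter as too old.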

I failed to solve this problem on Ubuntu 16.04 and switched to Ubuntu 20.04 (which doesn't have gcc-5, so the corresponding lines have to be removed). The Dockerfile that works for me is:

FROM ubuntu:20.04

ENV DEBIAN_FRONTEND="noninteractive"
RUN apt-get update
RUN apt-get install -y curl tzdata
RUN apt-get install -y vim make gcc g++ time binutils ruby python3 python3-dev python3-pip python3-matplotlib

COPY ./requirements.txt /root/requirements.txt
RUN pip3 install -r /root/requirements.txt

RUN useradd --create-home --shell /bin/bash asplos20ae

USER asplos20ae
WORKDIR /home/asplos20ae

COPY --chown=asplos20ae ./benchmarks /home/asplos20ae/benchmarks
COPY --chown=asplos20ae ./CK /home/asplos20ae/CK
COPY --chown=asplos20ae ./exp_scripts /home/asplos20ae/exp_scripts
COPY --chown=asplos20ae ./llvm /home/asplos20ae/llvm
COPY --chown=asplos20ae ./perspective_lib /home/asplos20ae/perspective_lib
COPY --chown=asplos20ae ./README.md /home/asplos20ae/README.md
COPY --chown=asplos20ae ./reference-result.txt /home/asplos20ae/reference-result.txt
COPY --chown=asplos20ae ./reference-comparison-exp.pdf  /home/asplos20ae/reference-comparison-exp.pdf
COPY --chown=asplos20ae ./reference-scalability-exp.pdf  /home/asplos20ae/reference-scalability-exp.pdf
COPY --chown=asplos20ae ./bashrc /home/asplos20ae/bashrc
RUN cat /home/asplos20ae/bashrc >> /home/asplos20ae/.bashrc
RUN rm /home/asplos20ae/bashrc

CMD /bin/bash

Unfortunately, even with this Dockerfile, the final step, which is supposed to run the benchmarks, doesn't work:

$ pwd
/home/asplos20ae
$ ck run artifact
CK error: [artifact] action "run" not found in module "artifact" (417c3b437caa4594)!

I'm not an expert in the "ck" framework, so I'm stuck here.

Please help me move forward or, preferably, post fixed instructions somewhere.

Yours, Andrey
Advanced Software Technology Lab, Huawei

vgene commented 3 years ago

Hi @andreybokhanko ,

We have noticed this problem and will update the Dockerfile with the proper Python dependencies soon.

In the meantime, if you are more interested in using CPF than in reproducing our results from the Perspective paper, I highly encourage you to follow the instructions in the bootstrap directory and build the most up-to-date CPF and its dependencies from scratch. Then run make compare.out under tests/regression/*/src to generate both the parallelized binary and the sequential binary and run the performance test. The times will be stored in parallel.time and seq.time, respectively. Currently, 2mm, 3mm, covariance, and correlation should work.
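Once both timing files exist, the speedup is the sequential time divided by the parallel time. The sketch below assumes that each file holds a single number of seconds as plain text. The actual format that CPF writes is not shown in this thread, so parse_time may need adjusting. All function names here are illustrative and not part of CPF.

```python
from pathlib import Path


def parse_time(text):
    """Parse timing-file contents, ASSUMING a single number of seconds."""
    return float(text.strip())


def speedup(seq_seconds, parallel_seconds):
    """Return sequential time divided by parallel time."""
    if parallel_seconds <= 0:
        raise ValueError("parallel time must be positive")
    return seq_seconds / parallel_seconds


def report(directory="."):
    """Read seq.time and parallel.time from `directory`, return the speedup."""
    base = Path(directory)
    seq = parse_time((base / "seq.time").read_text())
    par = parse_time((base / "parallel.time").read_text())
    return speedup(seq, par)


if __name__ == "__main__":
    try:
        print("speedup: {:.2f}x".format(report()))
    except (OSError, ValueError) as exc:
        # Missing files or an unexpected file format end up here.
        print("could not compute speedup: {}".format(exc))
```

Run it from a benchmark's src directory after make compare.out has finished. If the files use a different layout, only parse_time needs to change.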

andreybokhanko commented 3 years ago

Hi @vgene ,

Thanks for the quick response!

Yes, ultimately I'm interested in actually using CPF; I wanted to check your Perspective results to verify that they work in my environment / use case. Unfortunately, the master branch of CPF also fails in my environment (Ubuntu 20.04, gcc 9.3.0) at the LLVM build step. I used the unmodified Makefile.example file from your bootstrap directory.

Perhaps there is a problem in my environment. Could you please provide a Dockerfile for an environment that successfully builds the current CPF master?

Yours, Andrey

vgene commented 3 years ago

To Compile

Dockerfile for the Head

A Dockerfile for the current environment is a great idea and we will make sure it's available by the end of this week.

Fixed Dockerfile for Perspective Artifact

A fix for the Perspective paper artifact is available; @gchan510 will post it today.

gchan510 commented 3 years ago

Please use this Dockerfile; it contains updates for Python 3.8.

FROM ubuntu:16.04

RUN apt-get update --fix-missing
RUN apt-get install -y curl
RUN apt-get install -y vim make gcc-5 g++-5 time binutils ruby software-properties-common

RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 10
RUN update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 10

# Install more up-to-date python
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update
RUN apt-get install -y python3.8 python3.8-dev python3.8-venv
RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 10
RUN python3 -m ensurepip
RUN pip3 install --upgrade pip

COPY ./requirements.txt /root/requirements.txt
RUN pip3 install -r /root/requirements.txt

RUN useradd --create-home --shell /bin/bash asplos20ae

USER asplos20ae
WORKDIR /home/asplos20ae

COPY --chown=asplos20ae ./benchmarks /home/asplos20ae/benchmarks
COPY --chown=asplos20ae ./CK /home/asplos20ae/CK
COPY --chown=asplos20ae ./exp_scripts /home/asplos20ae/exp_scripts
COPY --chown=asplos20ae ./llvm /home/asplos20ae/llvm
COPY --chown=asplos20ae ./perspective_lib /home/asplos20ae/perspective_lib
COPY --chown=asplos20ae ./README.md /home/asplos20ae/README.md
COPY --chown=asplos20ae ./reference-result.txt /home/asplos20ae/reference-result.txt
COPY --chown=asplos20ae ./reference-comparison-exp.pdf  /home/asplos20ae/reference-comparison-exp.pdf
COPY --chown=asplos20ae ./reference-scalability-exp.pdf  /home/asplos20ae/reference-scalability-exp.pdf
COPY --chown=asplos20ae ./bashrc /home/asplos20ae/bashrc
RUN cat /home/asplos20ae/bashrc >> /home/asplos20ae/.bashrc
RUN rm /home/asplos20ae/bashrc

CMD /bin/bash

The requirements.txt also needs a change. Please use this updated one:

joblib>=0.13.2
numpy>=1.16.1
ptyprocess>=0.6.0
termcolor>=1.1.0
requests>=2.21.0
GitPython>=2.1.11
ck==1.11.4
pandas>=0.22
psutil
scipy
matplotlib

andreybokhanko commented 3 years ago

@vgene , @gchan510 , thank you for providing the fixes in such a short time!

I'm going on a two-week vacation; I will try your fixes after I return in May and report the results here.

Yours, Andrey

andreybokhanko commented 3 years ago

I tried the dockerfile that @gchan510 posted (along with requirements.txt updates) -- it works fine, thanks!

I suggest keeping this issue open until the updated Dockerfile and requirements.txt are committed and updated instructions for reproducing the artifact are published.

vgene commented 3 years ago

@andreybokhanko Awesome! Also note that there's a dockerfile now in the master branch. You can try to use that to build the latest CPF.

andreybokhanko commented 3 years ago

In the meantime, if you are more interested in using CPF than in reproducing our results from the Perspective paper, I highly encourage you to follow the instructions in the bootstrap directory and build the most up-to-date CPF and its dependencies from scratch. Then run make compare.out under tests/regression/*/src to generate both the parallelized binary and the sequential binary and run the performance test. The times will be stored in parallel.time and seq.time, respectively. Currently, 2mm, 3mm, covariance, and correlation should work.

Unfortunately, this fails for me with the current ToT version (git sha1 is a5d5a4398b6c68478848ff4949fc4e0b6f0be45b).

The build finishes successfully, but executing 2mm and 3mm leads to a core dump (at the benchmark.collaborative.exe execution step), dijkstra-dynsize fails to compile (it crashes at the noelle step), and both covariance and correlation fail to link. I used the make benchmark.compare.out command for all of the tests.

vgene commented 3 years ago

Thank you for the feedback! Were you using the Dockerfile? It seems like a setup issue; the latest regression run is passing: https://github.com/PrincetonUniversity/cpf/runs/2725055202?check_suite_focus=true. We will double-check the compilation inside Docker.

Could you give a brief summary of the setup, including the OS version and the gcc and glibc versions?
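The details requested above can be gathered in one go. This is a minimal sketch using only the Python standard library; it assumes a Linux system with /etc/os-release and a gcc on the PATH, and falls back to "unknown" otherwise. The function name collect_environment is made up for illustration.

```python
import platform
import shutil
import subprocess


def os_release():
    """Return PRETTY_NAME from /etc/os-release, or a generic fallback."""
    try:
        with open("/etc/os-release") as handle:
            for line in handle:
                if line.startswith("PRETTY_NAME="):
                    return line.split("=", 1)[1].strip().strip('"')
    except OSError:
        pass
    return platform.platform() or "unknown"


def gcc_version():
    """Return the first line of `gcc --version`, or "unknown"."""
    if shutil.which("gcc") is None:
        return "unknown"
    try:
        output = subprocess.check_output(
            ["gcc", "--version"], stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return "unknown"
    lines = output.splitlines()
    return lines[0] if lines else "unknown"


def libc_version():
    """Return the C library name and version the interpreter is linked against."""
    name, version = platform.libc_ver()
    return "{} {}".format(name, version).strip() or "unknown"


def collect_environment():
    """Collect OS, gcc and libc versions into a dictionary of strings."""
    return {"os": os_release(), "gcc": gcc_version(), "libc": libc_version()}


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print("{}: {}".format(key, value))
```

Running this inside the container gives a three-line summary that can be pasted straight into the issue.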

andreybokhanko commented 3 years ago

Yes, this is from a container using the Dockerfile you supplied -- so this is Ubuntu 20.04, gcc 9.3.0, glibc 2.31, etc.