Closed jemrobinson closed 2 years ago
From @richardclegg:
This set of commands runs raphtory when an internet connection is present. (But I guess cannot be tested without.)
sudo apt-get update
sudo apt -y install scala
echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
curl -sL "https://keyserver.ubuntu.com/pks/lookup?op=get&search=0x2EE0EA64E40A89B84B2DF73499E82A75642AC823" | sudo apt-key add
sudo apt-get update
sudo apt-get -y install sbt
sudo apt-get -y install git
git clone https://github.com/Raphtory/Raphtory.git
cd Raphtory
git checkout citation
sbt run com.raphtory.test.allcommands.AllCommandsTest
Tested on a recently-built compute VM image with internet access (see below). A couple of notes:
- git checkout citation does not work (this does not seem to be a valid branch)
- sbt run com.raphtory.test.allcommands.AllCommandsTest downloads a lot of packages from repo1.maven.org, which will not work in an offline environment
Maybe @richardclegg has thoughts on how to get this to work inside an SRE?
@miratepuffin can you comment on this?
Hi @jemrobinson,
I apologise about the branch: we have just published a new version, merged everything from citation and removed that branch. Either master or dev would be great.
For sbt, the packages downloaded from Maven are the Java packages required to run Raphtory. Once it has been run, these are fully downloaded and it does not need to fetch them again. Everything on Maven is scrupulously investigated before it is added (trust me, from someone who is trying to get their jars up there :D ) and it is the main repo for Java projects. Would it be OK to run this first on your side for the final env? If not, happy to hop on a call or discuss this further to find a solution.
Is there a command we could run at deployment time that would download all the dependencies?
I believe this is what sbt run com.raphtory.test.allcommands.AllCommandsTest does.
The alternative to this would be sbt clean assembly, which will build the fat jar with all packages. But either works :)
@miratepuffin I think you are saying that if they deploy the safe haven just with these packages then we could make it work at our end?
What I am saying is you need to run sbt clean assembly to pull the required jars from Maven so that Raphtory will run. It only needs to be run once, as those packages will then be available in the /target folder. After that no more internet connection is required.
i.e. after running the following you can disconnect the internet and we will be ready to go:
sudo apt -y install scala
echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
curl -sL "https://keyserver.ubuntu.com/pks/lookup?op=get&search=0x2EE0EA64E40A89B84B2DF73499E82A75642AC823" | sudo apt-key add
sudo apt-get update
sudo apt-get -y install sbt
sudo apt-get -y install git
git clone https://github.com/Raphtory/Raphtory.git
cd Raphtory
sbt clean assembly
Do you need any more information here? I think this should be all you need to do to get it working?
Sorry that last was for @jemrobinson
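A quick way to confirm that the assembly step produced a fat jar is to look for it under target/. The sketch below assumes the default sbt-assembly layout (a jar with "assembly" in its name under target/scala-<version>/); the function name find_assembly_jars is hypothetical, and the Raphtory build may name or place its jar differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of Raphtory): list fat jars left behind by
# `sbt clean assembly`.
# Assumption: the default sbt-assembly layout, i.e. a jar whose name contains
# "assembly" somewhere below a target/scala-<version>/ directory.
find_assembly_jars() {
    project_dir="$1"
    # Prints one path per matching jar; prints nothing if target/ is absent.
    find "$project_dir/target" -type f -path '*/scala-*/*assembly*.jar' 2>/dev/null
}

# Example: find_assembly_jars /scratch/Raphtory
```

An empty result means the build either failed or wrote its jar somewhere else, so it is worth checking before the network is switched off.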
Note that the following commands are run as root since there are no other users at this point
root> cd /scratch
root> git clone https://github.com/Raphtory/Raphtory.git
root> cd Raphtory && sbt clean assembly
root> chmod -R go=u /scratch/Raphtory
This seems to complete successfully without any obvious errors.
After deploying the VM I then run the following as a non-privileged user:
> cd /scratch/Raphtory
> sbt run com.raphtory.test.allcommands.AllCommandsTest
This then crashes with the following error:
Any ideas what else to try @miratepuffin ?
@miratepuffin Looks like those repos do not exist? What's happening here?
From the error it seems the machine at build time has no access to https://repo1.maven.org where the dependencies are hosted. @jemrobinson is this step being run already inside the safe haven? We were intending for this to be possible when the image is created, but there would be no Internet required after this.
Can you confirm that at no point the machine will be able to connect to Maven servers? If that is impossible the alternative is we copy these same jar packages to the git repo, and configure sbt so that it finds these files. This is a bit of a hack, but certainly doable.
Sorry, I wasn't clear earlier @felixcdr. I'll update my post above. Essentially, there is internet at build time, but even after running sbt clean assembly at build time, the package still tries to connect to Maven at run time. Is there a way to force it not to attempt this connection?
I think I have figured out the issue @jemrobinson. I tried on a local Docker container, and sbt does not require Internet when the dependencies are there, although there is a specific flag for running it in offline mode that we can apply (sbt "set offline := true" run). The culprit, if I understand your logs correctly, is that these are two different machines / user spaces, and the scratch folder is how information is copied from one to the other. On the first machine, all dependencies are downloaded to the ivy2 cache, which by default is located in ~/.ivy2/cache. But when running the code in the safe environment, I suspect this folder does not exist.
The solution I think would be to copy all the downloaded dependencies to the scratch folder space, and specify the ivy location when testing sbt:
AT COMPILE TIME
root> cd /scratch
root> git clone https://github.com/Raphtory/Raphtory.git
root> cd Raphtory && sbt clean assembly
root> mkdir ivy2
root> cp -R ~/.ivy2/cache /scratch/Raphtory/ivy2
root> chmod -R go=u /scratch/Raphtory
AT TESTING TIME
cd /scratch/Raphtory
sbt -Dsbt.ivy.home=/scratch/Raphtory/ivy2 run com.raphtory.test.allcommands.AllCommandsTest
My commands might be a bit off, but hopefully this works.
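Before testing offline, it may help to confirm that the relocated Ivy home actually contains a populated cache and that the chmod step left it readable for non-root users. The following is a minimal sketch; check_ivy_home is a hypothetical helper name, and the checks only cover what the commands above are meant to achieve.

```shell
#!/bin/sh
# Hypothetical helper: sanity-check a relocated Ivy home before going offline.
# Succeeds only if <ivy_home>/cache contains at least one file and nothing
# under <ivy_home> lacks the read bit for "other" users.
check_ivy_home() {
    ivy_home="$1"
    if [ ! -d "$ivy_home/cache" ]; then
        echo "missing: $ivy_home/cache"
        return 1
    fi
    if [ -z "$(find "$ivy_home/cache" -type f | head -n 1)" ]; then
        echo "empty: $ivy_home/cache"
        return 1
    fi
    # -perm -004 matches entries readable by "other"; negate to find the rest.
    unreadable=$(find "$ivy_home" ! -perm -004 | head -n 1)
    if [ -n "$unreadable" ]; then
        echo "not world-readable: $unreadable"
        return 1
    fi
    echo "ok: $ivy_home"
}

# Example: check_ivy_home /scratch/Raphtory/ivy2
```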
OK, this gets a bit further now but still crashes a bit later on. I'm getting the following runtime output:
> sbt -Dsbt.ivy.home=/opt/sbt/ivy run com.raphtory.test.allcommands.AllCommandsTest
copying runtime jar...
[info] [launcher] getting org.scala-sbt sbt 1.3.8. (this may take some time)
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: retrieving :: org.scala-sbt#boot-app
confs: [default]
81 artifacts copied, 0 already retrieved
[info] [launcher] getting Scala 2.12.10 (for sbt)...
:: retrieving :: org.scala-sbt#boot-scala
confs: [default]
6 artifacts copied, 0 already retrieved
[info] Loading settings for project raphtory-build from plugins.sbt ...
[info] Loading project definition from /scratch/Raphtory/project
[info] Updating
[info] Resolved dependencies
[warn]
[warn] Note: Some unresolved dependencies have extra attributes. Check that these dependencies exist with the requested attributes.
[warn] org.scalameta:sbt-scalafmt:2.3.2 (sbtVersion=1.0, scalaVersion=2.12)
[warn] com.lightbend.sbt:sbt-javaagent:0.1.5 (sbtVersion=1.0, scalaVersion=2.12)
[warn] com.eed3si9n:sbt-assembly:0.14.9 (sbtVersion=1.0, scalaVersion=2.12)
[warn] com.typesafe.sbt:sbt-native-packager:1.3.1 (sbtVersion=1.0, scalaVersion=2.12)
[warn]
[warn] Note: Unresolved dependencies path:
[error] sbt.librarymanagement.ResolveException: Error downloading org.scalameta:sbt-scalafmt;sbtVersion=1.0;scalaVersion=2.12:2.3.2
[error] Not found
[error] Not found
[error] download error: Caught java.net.UnknownHostException: repo1.maven.org (repo1.maven.org) while downloading https://repo1.maven.org/maven2/org/scalameta/sbt-scalafmt_2.12_1.0/2.3.2/sbt-scalafmt-2.3.2.pom
... so it looks like some packages have been pre-downloaded but others haven't. Would running sbt run com.raphtory.test.allcommands.AllCommandsTest at build time help with this @felixcdr @miratepuffin ?
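All four unresolved entries in the output above carry an sbtVersion attribute, which suggests they are sbt plugins for the build definition (the step that loads plugins.sbt) rather than Raphtory's own library dependencies. A small sketch for pulling those coordinates out of a saved log follows; unresolved_plugins is a hypothetical name, and the pattern assumes the "[warn] group:artifact:version (sbtVersion=...)" line shape seen here.

```shell
#!/bin/sh
# Hypothetical helper: print the group:artifact:version coordinates of
# unresolved sbt plugins from an sbt log (file arguments or stdin).
# Assumption: lines look like
#   [warn] org.scalameta:sbt-scalafmt:2.3.2 (sbtVersion=1.0, scalaVersion=2.12)
unresolved_plugins() {
    sed -n 's/^\[warn\][[:space:]]*\([^[:space:]:]*:[^[:space:]:]*:[^[:space:]]*\) (sbtVersion=.*/\1/p' "$@"
}

# Example: sbt ... 2>&1 | tee sbt.log; unresolved_plugins sbt.log
```

Comparing this list against what is present in the copied cache would show whether the plugin artifacts were ever downloaded to the location sbt is being pointed at.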
No, that's not expected :( These lines show that sbt has managed to copy the base executable files from the ivy2 folder. It has also found the project-specific configuration, as it is getting the version of sbt specified in build.properties.
Can you run the same command with the -v flag to see if debug gives some more info?
For reference, this is what I see on a container with no Internet, having deleted ~/.sbt and ~/.ivy2. It shows the same output, and then goes on to loading settings for the project...
[info] [launcher] getting org.scala-sbt sbt 1.3.8 (this may take some time)...
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: loading settings :: url = jar:file:/usr/share/sbt/bin/sbt-launch.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: retrieving :: org.scala-sbt#boot-app
confs: [default]
81 artifacts copied, 0 already retrieved
[info] [launcher] getting Scala 2.12.10 (for sbt)...
:: retrieving :: org.scala-sbt#boot-scala
confs: [default]
6 artifacts copied, 0 already retrieved
[info] Loading settings for project raphtory-build from plugins.sbt ...
So I think, @jemrobinson, that the issue is that the files our install downloads from the repositories into local directories are not getting copied to the safe haven environment. Felix's idea of adding -v to the sbt command seems the best way forward to collect more info.
Sorry, didn't see this reply (thanks for tagging me @richardclegg - that notifies me that there's been activity on the thread). I'll take a look later this week.
@jemrobinson Suspected that might be the case. I imagine you get a swarm of such things to deal with.
@jemrobinson Little nudge here. It would be great to get this going.
I'm looking at this issue today/tomorrow (sorry for the delay!).
@jemrobinson Let me know if you need more info. Keen to resolve this.
It looks to me like the problem might be with sbt itself. Is it trying to update itself? Any thoughts @felixcdr @miratepuffin ?
Note the command should be: sbt -v -Dsbt.ivy.home=/opt/sbt/ivy run com.raphtory.test.allcommands.AllCommandsTest
Had a hack session with @felixcdr and @miratepuffin. Nothing was obviously wrong, but the sbt process was hanging.
@felixcdr will investigate using a VM with the same Ubuntu version (18.04.5). For reference, the Safe Haven VM was using:
sbt 1.4.9
scala 2.11.12
javac 11.0.10
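The launcher output earlier in the thread shows sbt 1.3.8 being fetched, which differs from the installed 1.4.9; as noted above, the launcher uses the version pinned in the project's build.properties. A minimal sketch for reading that pinned version follows, so it can be compared with what is pre-installed or cached. The function name pinned_sbt_version is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the sbt version a project pins in
# project/build.properties (the "sbt.version=..." line).
pinned_sbt_version() {
    props="$1/project/build.properties"
    [ -f "$props" ] || return 1
    # Tolerates spaces around '=' and a trailing carriage return.
    sed -n 's/^[[:space:]]*sbt\.version[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$props" | head -n 1
}

# Example: pinned_sbt_version /scratch/Raphtory
```

If the pinned version differs from the installed one, the launcher needs that version available locally as well, which is one more thing to have in place before networking is removed.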
Hi @jemrobinson, @felixcdr and I have been testing this on a container with exactly the same Linux version we were working with during the hack session, and everything seems to be working correctly. This includes moving the files and cutting the network connection. Would you be up for a second quick hack session together to try this on a fresh Azure instance (non safe haven initially) where we reproduce the safe haven restrictions, to see if we can pin down the issue?
Yes, that's fine by me. I should have some time for this next week. Shall we fix a time by email?
Sorry for the delay @jemrobinson, I didn't get the notification, that sounds great! Will drop you an email with Felix in the morning.
Had a hack session with @felixcdr and @miratepuffin where we looked at trying to set this up on a fresh Ubuntu VM with networking turned on for the initial stages and then turned off afterwards:
WITH NETWORKING ON
root> cd /scratch
root> git clone https://github.com/Raphtory/Raphtory.git
root> cd Raphtory && sbt clean assembly
root> sbt "runMain com.raphtory.dev.allcommands.AllCommandsTest"
root> mkdir ivy2
root> cp -R ~/.ivy2/cache /scratch/Raphtory/ivy2
root> cp -R ~/.sbt /scratch/Raphtory/.sbt
root> chmod -R go=u /scratch/Raphtory
WITH NETWORKING OFF
cd /scratch/Raphtory
cp -R /scratch/Raphtory/.sbt ~/
sbt "set offline := true" -Dsbt.ivy.home=/scratch/Raphtory/ivy2 run com.raphtory.dev.allcommands.AllCommandsTest
This did not work either: the program still attempts to download dependencies on first run.
[info] Loading settings for project raphtory-build from plugins.sbt ...
[info] Loading project definition from /scratch/Raphtory/project
[info] Updating
| => raphtory-build / update 3s
https://repo1.maven.org/maven2/org/scala-lang/scala-library/2.12.10/scala-library-2.12.10.pom
0.0% [ ] 0 B (0 B / s)
https://repo1.maven.org/maven2/com/eed3si9n/sbt-assembly_2.12_1.0/0.14.9/sbt-assembly-0.14.9.pom
0.0% [ ] 0 B (0 B / s)
https://repo1.maven.org/maven2/com/typesafe/sbt/sbt-native-packager_2.12_1.0/1.3.1/sbt-native-packager-1.3.1.pom
0.0% [ ] 0 B (0 B / s)
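The URLs in the progress output above are Maven Central POMs for the build's plugins, requested even though ~/.ivy2/cache and ~/.sbt had been copied. One possible explanation, which is an assumption and not something confirmed in this thread, is that sbt 1.3 and later resolve dependencies through Coursier by default, and Coursier on Linux keeps its cache under ~/.cache/coursier/v1 rather than ~/.ivy2/cache. The sketch below only reports which of these candidate directories exist under a given home directory; report_sbt_caches is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: report which dependency-cache directories exist under
# a home directory.
# Assumption (not confirmed in this thread): sbt >= 1.3 may be using the
# Coursier cache at .cache/coursier/v1 instead of .ivy2/cache.
report_sbt_caches() {
    home_dir="$1"
    for rel in .ivy2/cache .sbt .cache/coursier/v1; do
        if [ -d "$home_dir/$rel" ]; then
            echo "present $rel"
        else
            echo "missing $rel"
        fi
    done
}

# Example: report_sbt_caches /root    (run with networking on, after the build)
```

If the Coursier directory turns out to be present on the build machine, copying it alongside the Ivy cache and pointing Coursier at the copy (for example through the COURSIER_CACHE environment variable) would be a natural next experiment.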
Closing due to lack of activity in the past 6 months.
:scroll: Description
The Raphtory project would like zsh and sbt added to the DSVM. Both zsh and sbt are installed and functional on the DSVM, but sbt attempts to connect to the internet on startup.
:strawberry: Desired behaviour