Ask-sola closed this issue 1 year ago
I saw in the documentation that parameters can be generated through the following method. I also see two new folders, "parameters-sf1" and "update-streams-sf1", but both folders are still empty:
Hi!
To run the Interactive v2 benchmark, first run scripts/install-dependencies.sh in the ldbc_snb_interactive_driver directory. This will install the dependencies required later by the parameter generator (paramgen is part of the driver repository). The remaining steps are run from the ldbc_snb_interactive_impls directory.
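A minimal sketch of that step, assuming the driver repository is cloned under ~/repositories (adjust the path to your machine):

cd ~/repositories/ldbc_snb_interactive_driver
scripts/install-dependencies.sh  # installs the dependencies used by the parameter generator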
Next, to create the dataset, you can run the example as shown in your screenshot:
export SF=1 # The scale factor to generate
export LDBC_SNB_DATAGEN_DIR=~/repositories/ldbc_snb_datagen_spark # Path to the LDBC SNB datagen directory
export LDBC_SNB_DATAGEN_MAX_MEM=8G # Maximum memory the datagen could use, e.g. 16G
export LDBC_SNB_DRIVER_DIR=~/repositories/ldbc_snb_interactive_driver # Path to the LDBC SNB driver directory
export DATA_INPUT_TYPE=parquet
# If using the Docker Datagen version, set the env variable:
export USE_DATAGEN_DOCKER=true
scripts/generate-all.sh
Make sure to change LDBC_SNB_DATAGEN_DIR and LDBC_SNB_DRIVER_DIR to point to the directories on your machine (or clone the repositories first).
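For example, a sketch assuming you keep the clones under ~/repositories, matching the exports above:

git clone https://github.com/ldbc/ldbc_snb_datagen_spark ~/repositories/ldbc_snb_datagen_spark
git clone https://github.com/ldbc/ldbc_snb_interactive_driver ~/repositories/ldbc_snb_interactive_driver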
Then, run the scripts/generate-all.sh script in the ldbc_snb_interactive_impls directory. This will create the folders in that directory. So, for example, when generating with SF=1, you will get two directories in the ldbc_snb_interactive_impls directory: update-streams-sf1, containing the update streams for the inserts and deletes, and parameters-sf1, containing the substitution parameters. After generation, you can update the benchmark.properties file for Postgres (assuming SF1 here):
ldbc.snb.interactive.scale_factor=1
ldbc.snb.interactive.updates_dir=~/repositories/ldbc_snb_interactive_impls/update-streams-sf1/
ldbc.snb.interactive.parameters_dir=~/repositories/ldbc_snb_interactive_impls/parameters-sf1/
Note: update the paths to match where your copy of ldbc_snb_interactive_impls is located.
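A quick sanity check that generation succeeded is to verify that both directories are now non-empty (paths as above; adjust to your machine):

ls ~/repositories/ldbc_snb_interactive_impls/parameters-sf1
ls ~/repositories/ldbc_snb_interactive_impls/update-streams-sf1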
Thank you very much for your answer, but when I ran scripts/generate-all.sh, the following problem still occurred:
I found the following steps by checking scripts/generate-all.sh. However, I originally thought that LDBC_SNB_DRIVER_DIR was an empty folder I created myself, and that its contents would be generated automatically by scripts/generate-all.sh. From here, it seems that is not the case. So how should I make sure that this directory contains the scripts I need? I am stuck at this step, so I cannot generate the update streams and parameters.
Hi,
First, you need to clone https://github.com/ldbc/ldbc_snb_interactive_driver, then point LDBC_SNB_DRIVER_DIR to that directory. Assuming you clone it in your home folder in a directory called ldbc, it would look like this:
export LDBC_SNB_DRIVER_DIR=~/ldbc/ldbc_snb_interactive_driver
When cloning that repository, you will find the required scripts.
Same for LDBC_SNB_DATAGEN_DIR; clone https://github.com/ldbc/ldbc_snb_datagen_spark and point to that directory:
export LDBC_SNB_DATAGEN_DIR=~/ldbc/ldbc_snb_datagen_spark
Then, using the previous commands, it should work. Note that in the ldbc_snb_interactive_driver repository there is a scripts/install-dependencies.sh script you should run first. Afterwards, navigate to the ldbc_snb_interactive_impls directory to execute scripts/generate-all.sh.
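In short, a sketch of the order of operations (the impls path here is hypothetical; use your own clone, and keep the SF and other exports from earlier in this thread):

(cd "$LDBC_SNB_DRIVER_DIR" && scripts/install-dependencies.sh)  # install paramgen dependencies first
cd ~/ldbc/ldbc_snb_interactive_impls                            # your ldbc_snb_interactive_impls clone
scripts/generate-all.sh                                         # then generate parameters and update streams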
Thank you very much for your reply. I have successfully completed the parameter generation so far.
Hello, I am running the Postgres implementation and have performed the following steps:
Firstly, I built the project:
postgres/scripts/build.sh
Next, I generated a dataset using Docker and obtained three folders: inserts, deletes, and initial_snapshot. The dataset is stored at /home/hay010618/ldbc/test1/out-sf1/graphs/csv/bi/composite-merged-fk:
docker run --mount type=bind,source="$(pwd)/test1",target=/out ldbc/datagen-standalone:0.5.1-2.12_spark3.2 --parallelism 1 -- --format csv --scale-factor 1 --mode bi --format-options compression=gzip
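For reference, the resulting layout can be checked like this (a sketch using the path reported above):

ls /home/hay010618/ldbc/test1/out-sf1/graphs/csv/bi/composite-merged-fk
# should list the initial_snapshot, inserts, and deletes folders mentioned above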
Next, I loaded the dataset:
export POSTGRES_CSV_DIR=~/ldbc/test1/out-sf1/graphs/csv/bi/composite-merged-fk
and ran scripts/load-in-one-step.sh. It runs smoothly up to this point. However, when I run driver/create-validation-parameters.sh, it displays:
Error loading Workload class
I think I see this because the folder that needs to be read is empty and does not contain the required parquet files. I found this issue: https://github.com/ldbc/ldbc_snb_interactive_impls/issues/322. However, I still don't understand how to change ldbc.snb.interactive.updates_dir and ldbc.snb.interactive.parameters_dir; will using the default values here make it impossible to run? Are the required parquet files obtained through the earlier dataset generation, or are they generated by driver/create-validation-parameters.sh? If it is the former, I generated the data in CSV mode and did not find a parquet parameters folder. If it is the latter, did I fail to generate the required parquet files in the corresponding folder, which caused this problem? This makes me very confused.
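For reference, the relevant settings from the earlier reply in this thread look like this (paths are illustrative and should point at the directories produced by scripts/generate-all.sh):

ldbc.snb.interactive.updates_dir=~/repositories/ldbc_snb_interactive_impls/update-streams-sf1/
ldbc.snb.interactive.parameters_dir=~/repositories/ldbc_snb_interactive_impls/parameters-sf1/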