kamigaito / SLAHAN

SLAHAN is an implementation of Kamigaito et al., 2020, "Syntactically Look-A-Head Attention Network for Sentence Compression", In Proc. of AAAI2020.
MIT License

Why is there an error when compiling dynet #3

Closed: PhyllisJi closed this issue 3 years ago

PhyllisJi commented 3 years ago

I keep getting various errors saying the CUDA library cannot be found. What version of CUDA should I use?

kamigaito commented 3 years ago

Thank you for trying to run our code! I used CUDA10.1 to compile Dynet. Besides the specification of CUDA libraries, you need to set BOOST_ROOT before setting up Eigen and Dynet, and also need to add the library location to LD_LIBRARY_PATH and CPLUS_INCLUDE_PATH.
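The environment setup described above can be sketched as a few shell exports. This is only a minimal sketch: the Boost and CUDA install prefixes below are placeholders for wherever those libraries live on your machine, not paths shipped with SLAHAN or Dynet.

```shell
# Hedged sketch of the environment described above. The install
# prefixes are placeholder assumptions, not repository paths.
export BOOST_ROOT=/opt/boost            # set before building Eigen and Dynet
export LD_LIBRARY_PATH="$BOOST_ROOT/lib:/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH"
export CPLUS_INCLUDE_PATH="$BOOST_ROOT/include:$CPLUS_INCLUDE_PATH"
```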

PhyllisJi commented 3 years ago

> Thank you for trying to run our code! I used CUDA10.1 to compile Dynet. Besides the specification of CUDA libraries, you need to set BOOST_ROOT before setting up Eigen and Dynet, and also need to add the library location to LD_LIBRARY_PATH and CPLUS_INCLUDE_PATH.

Thank you for your answer! I have solved this problem, but I have a new issue: after I run google/predict.sh, why can't I obtain comp.txt? I only have the following files in "/models/modelname_process_id/", and there is no "/{dataset size}" directory in "/models":

kamigaito commented 3 years ago

Sorry for the inconvenient file names. test_result_greedy.sents contains a 1 (keep) or 0 (delete) label for each token of each input sentence, like this:

<s> 1 1 1 1 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 1 </s>
<s> 0 1 1 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 </s>
<s> 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 </s>

Therefore, you can obtain the compressed sentences by picking the tokens labeled 1 from the input sentences. Please be careful about the start-of-sentence (<s>) and end-of-sentence (</s>) symbols.
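The extraction step described above could be sketched in shell as follows. This is a hedged sketch, not part of the repository: it assumes the labels between <s> and </s> align one-to-one with the tokens of the corresponding input line, and the function name and file names are placeholders.

```shell
# extract_compressed LABELS_FILE INPUT_FILE
# Hedged sketch: pairs line i of the label file with line i of the
# tokenized input file and keeps only tokens whose label is 1.
# Assumes the fields between <s> and </s> align one-to-one with the
# input tokens; function and file names are placeholders.
extract_compressed() {
  awk 'NR == FNR { labels[FNR] = $0; next }
       {
         n = split(labels[FNR], l, " ")   # l[1] = "<s>", l[n] = "</s>"
         split($0, w, " ")                # tokens of the input sentence
         out = ""
         for (i = 2; i < n; i++)          # skip the sentence symbols
           if (l[i] == "1")
             out = out (out == "" ? "" : " ") w[i - 1]
         print out
       }' "$1" "$2"
}
```

For example, `extract_compressed test_result_greedy.sents input.txt > comp.txt` would write one compressed sentence per line (the input file name here is an assumption, not an output guaranteed by predict.sh).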

PhyllisJi commented 3 years ago

> Sorry for the inconvenient file names. test_result_greedy.sents contains a 1 (keep) or 0 (delete) label for each token of each input sentence, like this:
>
> <s> 1 1 1 1 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 1 </s>
> <s> 0 1 1 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 </s>
> <s> 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 </s>
>
> Therefore, you can obtain the compressed sentences by picking the tokens labeled 1 from the input sentences. Please be careful about the start-of-sentence (<s>) and end-of-sentence (</s>) symbols.

OK, I see. Thank you. There's one more thing I want to confirm with you. If I want to use these models to compress my own prepared sentences, should I follow these steps:

kamigaito commented 3 years ago

Yes, your understanding is almost correct. In addition to those steps, you also need to prepare a dependency file like ${DATADIR}/file.cln.dep in your command. This is an example of the format:

${DATADIR}/file.cln.dep
0-0 8-1 8-2 6-3 6-4 6-5 8-6 6-7 0-8 11-9 11-10 8-11 14-12 14-13 11-14 14-15 17-16 14-17 14-18 21-19 21-20 14-21 11-22 24-23 22-24 0-25
0-0 2-1 5-2 5-3 5-4 14-5 5-6 6-7 6-8 8-9 8-10 6-11 6-12 14-13 0-14 14-15 15-16 14-17 0-18

However, it is not actually used in the prediction step except by the LSTM-Dep model in our paper. Thus, you can use as the dependency file an arbitrary file containing the same number of head-index pairs as there are tokens in the input text.
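Since for every model except LSTM-Dep only the token counts need to match, a dummy dependency file could be generated with a sketch like the one below. This is an assumption-laden illustration: the function name is made up, and using head index 0 for every token is inferred only from the sample lines above.

```shell
# make_dummy_dep TOKENIZED_INPUT_FILE
# Hedged sketch: emits one "head-index" pair per token, using 0 as a
# dummy head for every token. Only the number of pairs per line
# matters here, since the dependencies are ignored at prediction
# time for every model except LSTM-Dep.
make_dummy_dep() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++)
      out = out (i > 1 ? " " : "") "0-" (i - 1)
    print out
  }' "$1"
}
```

For example, `make_dummy_dep file.cln > file.cln.dep` (file names are placeholders matching the example above).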

PhyllisJi commented 3 years ago

> Yes, your understanding is almost correct. In addition to those steps, you also need to prepare a dependency file like ${DATADIR}/file.cln.dep in your command. This is an example of the format:
>
> ${DATADIR}/file.cln.dep
> 0-0 8-1 8-2 6-3 6-4 6-5 8-6 6-7 0-8 11-9 11-10 8-11 14-12 14-13 11-14 14-15 17-16 14-17 14-18 21-19 21-20 14-21 11-22 24-23 22-24 0-25
> 0-0 2-1 5-2 5-3 5-4 14-5 5-6 6-7 6-8 8-9 8-10 6-11 6-12 14-13 0-14 14-15 15-16 14-17 0-18
>
> However, it is not actually used in the prediction step except by the LSTM-Dep model in our paper. Thus, you can use as the dependency file an arbitrary file containing the same number of head-index pairs as there are tokens in the input text.

Do I need to prepare a pos file and a rel file?

kamigaito commented 3 years ago

> Do I need to prepare a pos file and a rel file?

These files are unnecessary for running the models. The current script extracts POS and relation labels from the Google sentence compression dataset, but these labels are not actually used by the models.

PhyllisJi commented 3 years ago

> > Do I need to prepare a pos file and a rel file?
>
> These files are unnecessary for running the models. The current script extracts POS and relation labels from the Google sentence compression dataset, but these labels are not actually used by the models.

Thank you for your answers! I have successfully compressed my own dataset!

SriramPingali commented 3 years ago

> Thank you for trying to run our code! I used CUDA10.1 to compile Dynet. Besides the specification of CUDA libraries, you need to set BOOST_ROOT before setting up Eigen and Dynet, and also need to add the library location to LD_LIBRARY_PATH and CPLUS_INCLUDE_PATH.

Hey! Can you please tell me where to find the Boost libraries? I ran the bootstrapping process to build the libs from the zip file, but I can't seem to find a folder containing the "include" and "lib" subfolders that the compiler requires.