kamigaito / SLAHAN

SLAHAN is an implementation of Kamigaito et al., 2020, "Syntactically Look-A-Head Attention Network for Sentence Compression", In Proc. of AAAI2020.
MIT License

Why is there an error when compiling dynet #3

Closed: PhyllisJi closed this issue 3 years ago

PhyllisJi commented 3 years ago

I keep getting various errors saying the CUDA library cannot be found. What version of CUDA should I use?

kamigaito commented 3 years ago

Thank you for trying to run our code! I used CUDA10.1 to compile Dynet. Besides the specification of CUDA libraries, you need to set BOOST_ROOT before setting up Eigen and Dynet, and also need to add the library location to LD_LIBRARY_PATH and CPLUS_INCLUDE_PATH.
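The environment setup described above can be sketched as a few shell exports. This is only a minimal sketch: the Boost and CUDA install prefixes below are placeholders for wherever those libraries live on your machine, not paths shipped with SLAHAN or Dynet.

```shell
# Hedged sketch of the environment described above. The install
# prefixes are placeholder assumptions, not repository paths.
export BOOST_ROOT=/opt/boost            # set before building Eigen and Dynet
export LD_LIBRARY_PATH="$BOOST_ROOT/lib:/usr/local/cuda-10.1/lib64:$LD_LIBRARY_PATH"
export CPLUS_INCLUDE_PATH="$BOOST_ROOT/include:$CPLUS_INCLUDE_PATH"
```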

PhyllisJi commented 3 years ago

> Thank you for trying to run our code! I used CUDA10.1 to compile Dynet. Besides the specification of CUDA libraries, you need to set BOOST_ROOT before setting up Eigen and Dynet, and also need to add the library location to LD_LIBRARY_PATH and CPLUS_INCLUDE_PATH.

Thank you for your answer! I have solved this problem, but I have a new issue: after I run google/predict.sh, why can't I obtain comp.txt? I only have the following files in "/models/modelname_process_id/", and there is no "/{dataset size}" directory in "/models":

kamigaito commented 3 years ago

Sorry for the inconvenient file names. test_result_greedy.sents contains a 1 (keep) or 0 (delete) label for each token of each input sentence, like this:

<s> 1 1 1 1 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 1 </s>
<s> 0 1 1 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 </s>
<s> 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 </s>

Therefore, you can obtain the compressed sentences by picking the tokens labeled 1 from the input sentences. Please be careful about the start-of-sentence (<s>) and end-of-sentence (</s>) symbols.
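The extraction step described above could be sketched in shell as follows. This is a hedged sketch, not part of the repository: it assumes the labels between <s> and </s> align one-to-one with the tokens of the corresponding input line, and the function name and file names are placeholders.

```shell
# extract_compressed LABELS_FILE INPUT_FILE
# Hedged sketch: pairs line i of the label file with line i of the
# tokenized input file and keeps only tokens whose label is 1.
# Assumes the fields between <s> and </s> align one-to-one with the
# input tokens; function and file names are placeholders.
extract_compressed() {
  awk 'NR == FNR { labels[FNR] = $0; next }
       {
         n = split(labels[FNR], l, " ")   # l[1] = "<s>", l[n] = "</s>"
         split($0, w, " ")                # tokens of the input sentence
         out = ""
         for (i = 2; i < n; i++)          # skip the sentence symbols
           if (l[i] == "1")
             out = out (out == "" ? "" : " ") w[i - 1]
         print out
       }' "$1" "$2"
}
```

For example, `extract_compressed test_result_greedy.sents input.txt > comp.txt` would write one compressed sentence per line (the input file name here is an assumption, not an output guaranteed by predict.sh).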

PhyllisJi commented 3 years ago

> Sorry for the inconvenient file names. test_result_greedy.sents contains a 1 (keep) or 0 (delete) label for each token of each input sentence, like this:
>
> <s> 1 1 1 1 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 0 1 </s>
> <s> 0 1 1 0 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 </s>
> <s> 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 </s>
>
> Therefore, you can obtain the compressed sentences by picking the tokens labeled 1 from the input sentences. Please be careful about the start-of-sentence (<s>) and end-of-sentence (</s>) symbols.

OK, I see. Thank you. There's one more thing I want to confirm with you. If I want to use these models to compress my own prepared sentences, should I follow these steps:

kamigaito commented 3 years ago

Yes, your understanding is almost correct. In addition to those steps, you also need to prepare a dependency file like ${DATADIR}/file.cln.dep in your command. This is an example of the format:

${DATADIR}/file.cln.dep
0-0 8-1 8-2 6-3 6-4 6-5 8-6 6-7 0-8 11-9 11-10 8-11 14-12 14-13 11-14 14-15 17-16 14-17 14-18 21-19 21-20 14-21 11-22 24-23 22-24 0-25
0-0 2-1 5-2 5-3 5-4 14-5 5-6 6-7 6-8 8-9 8-10 6-11 6-12 14-13 0-14 14-15 15-16 14-17 0-18

However, it is not actually used in the prediction step except by the LSTM-Dep model in our paper. Thus, you can use as the dependency file an arbitrary file containing the same number of head-index pairs as there are tokens in the input text.
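Since for every model except LSTM-Dep only the token counts need to match, a dummy dependency file could be generated with a sketch like the one below. This is an assumption-laden illustration: the function name is made up, and using head index 0 for every token is inferred only from the sample lines above.

```shell
# make_dummy_dep TOKENIZED_INPUT_FILE
# Hedged sketch: emits one "head-index" pair per token, using 0 as a
# dummy head for every token. Only the number of pairs per line
# matters here, since the dependencies are ignored at prediction
# time for every model except LSTM-Dep.
make_dummy_dep() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++)
      out = out (i > 1 ? " " : "") "0-" (i - 1)
    print out
  }' "$1"
}
```

For example, `make_dummy_dep file.cln > file.cln.dep` (file names are placeholders matching the example above).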

PhyllisJi commented 3 years ago

> Yes, your understanding is almost correct. In addition to those steps, you also need to prepare a dependency file like ${DATADIR}/file.cln.dep in your command. This is an example of the format:
>
> ${DATADIR}/file.cln.dep
> 0-0 8-1 8-2 6-3 6-4 6-5 8-6 6-7 0-8 11-9 11-10 8-11 14-12 14-13 11-14 14-15 17-16 14-17 14-18 21-19 21-20 14-21 11-22 24-23 22-24 0-25
> 0-0 2-1 5-2 5-3 5-4 14-5 5-6 6-7 6-8 8-9 8-10 6-11 6-12 14-13 0-14 14-15 15-16 14-17 0-18
>
> However, it is not actually used in the prediction step except by the LSTM-Dep model in our paper. Thus, you can use as the dependency file an arbitrary file containing the same number of head-index pairs as there are tokens in the input text.

Do I need to prepare a pos file and a rel file?

kamigaito commented 3 years ago

> Do I need to prepare a pos file and a rel file?

These files are unnecessary for running the models. The current script extracts POS and relation labels from the Google sentence compression dataset, but these labels are not actually used by the models.

PhyllisJi commented 3 years ago

> > Do I need to prepare a pos file and a rel file?
>
> These files are unnecessary for running the models. The current script extracts POS and relation labels from the Google sentence compression dataset, but these labels are not actually used by the models.

Thank you for your answers! I have successfully compressed my own dataset!

SriramPingali commented 3 years ago

> Thank you for trying to run our code! I used CUDA10.1 to compile Dynet. Besides the specification of CUDA libraries, you need to set BOOST_ROOT before setting up Eigen and Dynet, and also need to add the library location to LD_LIBRARY_PATH and CPLUS_INCLUDE_PATH.

Hey! Can you please tell me where to find the Boost libraries? I ran the bootstrapping process to build the libs from the zip file, but I can't seem to find a folder containing the "include" and "lib" subfolders that the compiler requires.