Cynwell / Text-Level-GNN

Text Level Graph Neural Network for Text Classification
Apache License 2.0

running the project #4

Closed: Abdullahi-Sherko closed this issue 2 years ago

Abdullahi-Sherko commented 2 years ago

Hi, I tried to run the implementation on Colab but got an error. Is it possible to run it on Colab?

Cynwell commented 2 years ago

It should be fine to run on Colab as well. What error message did you encounter?

Abdullahi-Sherko commented 2 years ago

Hi, I appreciate your answer. It's very important for me to run this implementation. I have pasted the error here.

train.py:35: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  self.embedding_matrix = np.array(self.embedding_matrix).astype(np.float32) # Convert if from double to float for efficiency
TypeError: float() argument must be a string or a number, not 'list'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train.py", line 317, in <module>
    tokenizer = GloveTokenizer(f'embeddings/glove.6B.{args.embedding_size}d.txt')
  File "train.py", line 35, in __init__
    self.embedding_matrix = np.array(self.embedding_matrix).astype(np.float32) # Convert if from double to float for efficiency
ValueError: setting an array element with a sequence.

Cynwell commented 2 years ago

I created a virtual environment with Python 3.8, PyTorch 1.9.1, NumPy 1.21.2 (the newest version) and pandas 1.3.4, and my code ran fine in it. Could you run pip install numpy --upgrade in your environment before executing the code? You can also use pip show numpy to check whether NumPy is up to date.

Commands to create a virtual environment:

conda create -n python38_numpy121 python=3.8 -y
conda activate python38_numpy121
conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia -y
pip install pandas
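
As a rough alternative to running pip show for each package, the installed versions can be printed from Python. This is only a sketch: it assumes Python 3.8 or newer (for importlib.metadata), and installed_versions is a hypothetical helper name, not part of this repository.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    # Print the interpreter version, then the packages discussed in this thread.
    print("python", sys.version.split()[0])
    for name, version in installed_versions(["numpy", "torch", "pandas"]).items():
        print(name, version or "not installed")
```

Comparing this output between the working environment and Colab shows quickly whether a version mismatch is a plausible cause.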

Abdullahi-Sherko commented 2 years ago

Thank you for giving me your precious time. I tried your solution on Colab but it did not work. I got the same error as above again.

pandas-1.3.4 numpy-1.21.2 python-3.7.2 pytorch-1.9.0

Cynwell commented 2 years ago

I just tested my code on a temporary Google Colab runtime and hit the same error as you. I soon realized the cause: the GloVe word embeddings file had not finished uploading. Please run the code again once the entire embeddings file has been uploaded. Below is a screenshot of a successful run on Colab: [image]
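
A partially uploaded file can end with a row that has fewer values than the others, which would make the list of vectors ragged and is consistent with the NumPy error above. The sketch below checks for such rows before training. It assumes the usual GloVe text layout (a word followed by space-separated numbers on each line), and find_bad_glove_lines is a hypothetical helper, not part of train.py.

```python
def find_bad_glove_lines(path, embedding_size, max_report=5):
    """Return (line_number, number_of_values) for rows whose length is wrong."""
    bad = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            parts = line.rstrip("\n").split(" ")
            n_values = len(parts) - 1  # the first token is the word itself
            if n_values != embedding_size:
                bad.append((line_number, n_values))
                if len(bad) >= max_report:
                    break  # a few examples are enough to diagnose the file
    return bad
```

For example, find_bad_glove_lines('embeddings/glove.6B.50d.txt', 50) should return an empty list for a complete file. Note that this only catches rows with the wrong number of values; a file cut exactly at a line boundary would pass, so comparing the file size with the original is still worthwhile.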

The commands used to set up the environment are as follows (no need to update packages):

!git clone https://github.com/Cynwell/Text-Level-GNN.git
!mv Text-Level-GNN/* .
!mkdir embeddings
!rm -rf Text-Level-GNN sample_data

Then upload the GloVe embeddings file into the folder embeddings/.

Finally, run the program with the following line:

!python train.py --cuda=0 --embedding_size=50 --p=3 --min_freq=2 --max_length=70 --dropout=0 --epoch=10

By the way, uploading the embeddings file to the temporary Colab runtime takes a very long time. It may be much faster to upload the file to Google Drive and mount your Drive in Colab.
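
One possible way to do this is sketched below. It mounts Drive with google.colab.drive.mount and symlinks the embeddings file into embeddings/ so that train.py finds it at the expected path. The Drive location of the file is an assumption (adjust it to wherever you stored the file), and link_embeddings is a hypothetical helper, not part of the repository.

```python
import os


def link_embeddings(source_file, target_dir="embeddings"):
    """Symlink source_file into target_dir and return the path of the link."""
    os.makedirs(target_dir, exist_ok=True)
    link_path = os.path.join(target_dir, os.path.basename(source_file))
    if not os.path.lexists(link_path):  # do not overwrite an existing file or link
        os.symlink(os.path.abspath(source_file), link_path)
    return link_path


if __name__ == "__main__":
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        drive = None
    if drive is not None:
        drive.mount("/content/drive")
        # Assumed location: the file sits at the top level of "My Drive".
        link_embeddings("/content/drive/MyDrive/glove.6B.50d.txt")
```

A symlink avoids copying the large file into the runtime; copying it instead (for example with shutil.copy) is an alternative if reads from the mounted Drive turn out to be slow.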

Abdullahi-Sherko commented 2 years ago

Finally!!! Thank you so much. With your guidance it finally ran, using an embedding_size of 50.

If possible, I would like a way to contact you directly, by email or whatever is easiest for you. I have some questions about the code that may not be suitable to ask here. Thank you so much again.

Cynwell commented 2 years ago

Glad to hear that! My email is cynwelllau@gmail.com.