ChEB-AI / python-chebai

GNU Affero General Public License v3.0

Refactor Tox21MolNet #56

Open aditya0by0 opened 2 months ago

aditya0by0 commented 2 months ago

Note: This came up while adding unit tests for this class, so the tests should be included when this issue is resolved (see #45).

sfluegel05 commented 1 month ago

Could you give me the command you used for the tox21 dataset? I have some trouble loading it (probably not related to this issue)

aditya0by0 commented 1 month ago

> Could you give me the command you used for the tox21 dataset? I have some trouble loading it (probably not related to this issue)

I don't have the exact command, as I just called the method on the object of the class. I was mainly focused on testing the method rather than the specific command.

sfluegel05 commented 1 month ago

I had some trouble with the download function:

  File "C:\Users\Simon Flügel\AppData\Local\Programs\Python\Python311\Lib\urllib\request.py", line 251, in urlretrieve
    tfp = open(filename, 'wb')
          ^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: 'C:\\Users\\SIMONF~1\\AppData\\Local\\Temp\\tmpjwyb15vt'

The problem is not that the file location is inaccessible. As far as I understand, the download function first opens a temporary file and then calls urlretrieve with that file's name, so the same path is opened a second time while the first handle is still open. This should be fixed by commit 29cff11.
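
For illustration, here is a minimal sketch of one way to avoid this pattern. It is an assumption about a possible fix, not necessarily what commit 29cff11 does, and `download_to` is a hypothetical helper name rather than part of chebai. The idea is to reserve a temporary path, close its handle immediately, and only then let `urlretrieve` open that path for writing.

```python
import os
import tempfile
from urllib.request import urlretrieve


def download_to(url: str, target_path: str) -> str:
    """Download `url` to `target_path` (hypothetical helper, not chebai API)."""
    target_dir = os.path.dirname(os.path.abspath(target_path))
    os.makedirs(target_dir, exist_ok=True)

    # mkstemp returns an open OS-level handle. Close it right away so that
    # urlretrieve can open the same path itself; on Windows a second open of
    # a path that is still held open can raise PermissionError.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".part")
    os.close(fd)
    try:
        urlretrieve(url, tmp_path)
        # Move the finished download into place in one step.
        os.replace(tmp_path, target_path)
    finally:
        # Clean up the partial file if the download or the move failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return target_path
```

Writing to a `.part` file in the target directory and renaming it afterwards also means an interrupted download never leaves a truncated file under the final name.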

However, this still does not yield a functioning dataset: the class does not apply any tokenisation, so the collate function receives an unprocessed SMILES string. @MGlauer You have done some runs with the MolNet class. Did you use an already existing dataset, or do you have another preprocessing pipeline for the Tox21MolNet dataset?
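
To make the missing step concrete, the sketch below shows roughly what has to happen between the raw SMILES string and the collate function. It is a simplified illustration in plain Python, not chebai's reader or collator: the regular expression, the vocabulary handling, and the names `tokenize`, `encode`, and `collate` are all assumptions made for this example.

```python
import re

# Simplified SMILES token pattern (an assumption for this sketch): bracket
# atoms such as [Na+], the two-letter halogens Br and Cl, then any single
# character (atoms, ring digits, bonds, parentheses).
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|.")
PAD_ID = 0  # id 0 is reserved for padding


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into tokens."""
    return SMILES_TOKEN.findall(smiles)


def encode(smiles: str, vocab: dict[str, int]) -> list[int]:
    """Map tokens to integer ids, assigning new ids (starting at 1) on first use."""
    return [vocab.setdefault(tok, len(vocab) + 1) for tok in tokenize(smiles)]


def collate(encoded_batch: list[list[int]]) -> list[list[int]]:
    """Right-pad every encoded sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in encoded_batch)
    return [seq + [PAD_ID] * (max_len - len(seq)) for seq in encoded_batch]


vocab: dict[str, int] = {}
batch = collate([encode(s, vocab) for s in ["CCO", "c1ccccc1Cl"]])
```

Without the `encode` step, `collate` would be handed strings of differing length instead of integer sequences, which matches the failure described above.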