volkamerlab / teachopencadd

TeachOpenCADD: a teaching platform for computer-aided drug design (CADD) using open source packages and data
https://projects.volkamerlab.org/teachopencadd
Creative Commons Attribution 4.0 International

T032: filter_explore_activity_data error #358

Closed hamzaibrahim21 closed 1 year ago

hamzaibrahim21 commented 1 year ago

When trying to run T032 locally to prepare for the merge into master, the following error occurred:

filter_explore_activity_data(PAPYRUS_VERSION, adenosine_receptors)

/home/hamza/Github/teachopencadd/teachopencadd/talktorials/T032_compound_activity_proteochemometrics/talktorial.ipynb Cell 46 in ()
      3 print(DATA)
      5 # Filter the Papyrus bioactivity dataset and plot the distribution of activity values for the targets of interest
----> 6 ar_dataset = filter_explore_activity_data(PAPYRUS_VERSION, adenosine_receptors)

/home/hamza/Github/teachopencadd/teachopencadd/talktorials/T032_compound_activity_proteochemometrics/talktorial.ipynb Cell 46 in filter_explore_activity_data(papyrus_version, targets)
     17 # Read downloaded Papyrus dataset in chunks, as it does not fit in memory
     18 CHUNKSIZE = 100000
---> 19 data = papyrus_scripts.read_papyrus(
     20     version=papyrus_version, chunksize=CHUNKSIZE, source_path=DATA
     21 )
     23 # Create filter for targets of interest
     24 target_accession_list = targets.values()

File ~/anaconda3/envs/teachopencadd_t032/lib/python3.8/site-packages/papyrus_scripts/reader.py:33, in read_papyrus(is3d, version, plusplus, chunksize, source_path)
     31 if source_path is not None:
     32     os.environ['PYSTOW_HOME'] = os.path.abspath(source_path)
---> 33 version = process_data_version(version=version, root_folder=source_path)
     34 source_path = pystow.module('papyrus', version)
     35 # Load data types

File ~/anaconda3/envs/teachopencadd_t032/lib/python3.8/site-packages/papyrus_scripts/utils/IO.py:179, in process_data_version(version, root_folder)
    173 """Confirm the version is available, downloaded and convert synonyms.
...
--> 179 available_versions = get_downloaded_versions(root_folder) + ['latest']
    180 if version not in available_versions:
    181     raise ValueError(f'version can only be one of [{", ".join(available_versions)}]')

TypeError: unsupported operand type(s) for +: 'dict' and 'list' 

To create the environment I used: conda env create -f T032_env.yml

The packages used in the talktorial were installed successfully as well.

AndreaVolkamer commented 1 year ago

@gorostiolam and/or @jesperswillem, we are preparing an update of the TOC master and would like to include your notebook as well, if possible. When running it locally, @hamzaibrahim21 came across the issue above. Could you help?

gorostiolam commented 1 year ago

Thanks for bringing it to my attention. This error arises when trying to read Papyrus data that has not been downloaded. @hamzaibrahim21, was this cell run, and was its output successful?

%%time
papyrus_scripts.download_papyrus(
    outdir=DATA, version=PAPYRUS_VERSION, nostereo=True, stereo=False, descriptors=None
)
# If you want to download the latest version of the Papyrus dataset, change 'PAPYRUS_VERSION' to 'latest'

I have pulled and run the latest version of the talktorial with a freshly installed environment (conda env create -f T032_env.yml) and I have had no issues.
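
For reference, the TypeError at the bottom of the traceback can be reproduced without Papyrus at all, since `+` is not defined between a dict and a list. The sketch below is only an illustration: `fake_get_downloaded_versions` is a hypothetical stand-in, not the papyrus_scripts implementation, and it assumes (based on the traceback alone) that a dict comes back when nothing has been downloaded. The version string is just an example value.

```python
def fake_get_downloaded_versions(downloaded):
    """Hypothetical stand-in: a list of versions if any exist, else an empty dict."""
    return list(downloaded) if downloaded else {}


def available_versions(downloaded):
    # Same shape as the failing expression in the traceback: <result> + ['latest']
    return fake_get_downloaded_versions(downloaded) + ["latest"]


# With at least one downloaded version, the concatenation works
print(available_versions(["05.5"]))

# With nothing downloaded, the dict + list concatenation raises TypeError
try:
    available_versions([])
except TypeError as err:
    message = str(err)
    print(message)
```

So a TypeError of this form at `process_data_version` is a hint to check whether the download cell actually completed, rather than a sign of a broken environment.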

hamzaibrahim21 commented 1 year ago

@gorostiolam It seems I had a space issue that blocked the data from being downloaded. It now works on my local machine. Thanks!
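
In case someone else runs into this, a quick check of the free space before running the download cell may help. This is a minimal sketch using only the Python standard library; `has_free_space` and `REQUIRED_BYTES` are hypothetical names, and the threshold is a placeholder, not the actual size of the Papyrus download.

```python
import shutil
from pathlib import Path

# Placeholder threshold (10 GiB); NOT the real Papyrus download size
REQUIRED_BYTES = 10 * 1024**3


def has_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    path = Path(path).expanduser().resolve()
    # The output directory may not exist yet, so walk up to the nearest existing parent
    while not path.exists():
        path = path.parent
    return shutil.disk_usage(path).free >= required_bytes


if has_free_space(".", REQUIRED_BYTES):
    print("Enough free space for the placeholder threshold.")
else:
    print("Free space is below the placeholder threshold; the download may fail.")
```

In the notebook, the same check could be pointed at the `DATA` directory before calling the download cell.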