a-r-j / ProteinWorkshop

Benchmarking framework for protein representation learning. Includes a large number of pre-training and downstream task datasets, models and training/task utilities. (ICLR 2024)
https://proteins.sh/
MIT License

Unable to download dataset ec_reaction #94

Closed yangzhang33 closed 2 months ago

yangzhang33 commented 3 months ago

When passing dataset=ec_reaction, I get the error: URLError: <urlopen error [Errno -2] Name or service not known>

Screenshot 2024-07-05 13 47 47

Tested on two machines and internet connections.

a-r-j commented 3 months ago

Hi @yangzhang33, thanks for the bug report; we'll investigate.

In the meantime, you can download the dataset from here: https://zenodo.org/records/8282470 or build it from source yourself.

a-r-j commented 3 months ago

Ah, on second thought, are you using format="mmtf"? The PDB sunset hosting of MMTF files earlier this month. You can try switching to format="pdb" until we add BCif support.
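The "[Errno -2] Name or service not known" message in the original report is what a failed hostname lookup looks like, which fits a download host that has been retired. A minimal sketch, not taken from the thread or from ProteinWorkshop, of telling a DNS failure apart from other download errors; the helper name is_dns_failure is hypothetical:

```python
import socket
import urllib.error


def is_dns_failure(err: urllib.error.URLError) -> bool:
    """Return True if the URLError wraps a hostname-resolution failure.

    urllib stores the underlying exception in `err.reason`; a failed DNS
    lookup surfaces there as `socket.gaierror`.
    """
    return isinstance(err.reason, socket.gaierror)


# Reconstruct the kind of error shown in the report (illustrative only).
example = urllib.error.URLError(socket.gaierror(-2, "Name or service not known"))
print(is_dns_failure(example))
```

If the check returns True, the remote host name itself no longer resolves, so retrying on another machine or network is unlikely to help when the host has been taken down.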

yangzhang33 commented 3 months ago

Thank you. If I understand correctly, the format in the code is 'ent'. I will try the one you mentioned.

a-r-j commented 3 months ago

Ah I see. 'ent' should be okay (it's really pdb). Are you using the datamodule as an import in a script or are you training from the CLI? If you're using the CLI then the thing to check is the config value: https://github.com/a-r-j/ProteinWorkshop/blob/61294d4bafab7779121cf4eaa4435742b61b709a/proteinworkshop/config/dataset/ec_reaction.yaml#L5

yangzhang33 commented 3 months ago

Yes, thank you for this. The config does indeed set 'mmtf', so I am using 'mmtf' rather than 'ent'.
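When composing the config in a notebook or script, the format can also be changed with a Hydra override rather than by editing the YAML file. A minimal sketch under a stated assumption: the key path dataset.datamodule.format is a guess at how the value is nested and should be checked against ec_reaction.yaml; the helper with_format is hypothetical and not part of ProteinWorkshop.

```python
from typing import List

# ASSUMPTION: hypothetical key path; verify the real nesting in ec_reaction.yaml.
FORMAT_KEY = "dataset.datamodule.format"


def with_format(overrides: List[str], fmt: str = "pdb", key: str = FORMAT_KEY) -> List[str]:
    """Return a copy of `overrides` with `key` set to `fmt`.

    Any existing override of the same key (with or without a leading '+')
    is dropped so the new value is the only one present.
    """
    kept = [o for o in overrides if o.lstrip("+").split("=", 1)[0] != key]
    return kept + [f"{key}={fmt}"]


overrides = with_format(["dataset=ec_reaction"])
print(overrides)
```

The resulting list can be passed as the overrides argument of hydra.compose in place of a hand-written list.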

yangzhang33 commented 2 months ago

@a-r-j Hello, I tried the solution and the dataset downloaded successfully: there are now ECReaction and pdb folders in my dataset folder. However, I still can't get the model running; the following error occurs:

RecursionError                            Traceback (most recent call last)
File ~/miniconda3/envs/3d/lib/python3.10/site-packages/IPython/core/formatters.py:226, in catch_format_error(method, self, *args, **kwargs)
    225 try:
--> 226     r = method(self, *args, **kwargs)
    227 except NotImplementedError:
    228     # don't warn on NotImplementedErrors

File ~/miniconda3/envs/3d/lib/python3.10/site-packages/IPython/core/formatters.py:916, in IPythonDisplayFormatter.__call__(self, obj)
    915 try:
--> 916     printer = self.lookup(obj)
    917 except KeyError:

File ~/miniconda3/envs/3d/lib/python3.10/site-packages/IPython/core/formatters.py:397, in BaseFormatter.lookup(self, obj)
    396 # then lookup by type
--> 397 return self.lookup_by_type(_get_type(obj))

File ~/miniconda3/envs/3d/lib/python3.10/site-packages/IPython/core/formatters.py:427, in BaseFormatter.lookup_by_type(self, typ)
    426 for cls in pretty._get_mro(typ):
--> 427     if cls in self.type_printers or self._in_deferred_types(cls):
    428         return self.type_printers[cls]

File ~/miniconda3/envs/3d/lib/python3.10/site-packages/IPython/core/formatters.py:564, in BaseFormatter._in_deferred_types(self, cls)
    563 key = (mod, name)
--> 564 if key in self.deferred_printers:
    565     # Move the printer over to the regular registry.
    566     printer = self.deferred_printers.pop(key)

File ~/miniconda3/envs/3d/lib/python3.10/site-packages/traitlets/traitlets.py:687, in TraitType.__get__(self, obj, cls)
    686 else:
--> 687     return t.cast(G, self.get(obj, cls))

File ~/miniconda3/envs/3d/lib/python3.10/site-packages/traitlets/traitlets.py:666, in TraitType.get(self, obj, cls)
    665 else:
--> 666     return t.cast(G, value)

RecursionError: maximum recursion depth exceeded

During handling of the above exception, another exception occurred:

RecursionError                            Traceback (most recent call last)
File ~/miniconda3/envs/3d/lib/python3.10/site-packages/IPython/core/interactiveshell.py:2168, in InteractiveShell.showtraceback(self, exc_tuple, filename, tb_offset, exception_only, running_compiled_code)
   2167 else:
-> 2168     stb = self.InteractiveTB.structured_traceback(
   2169         etype, value, tb, tb_offset=tb_offset
...
    779 def _buffer_index(self) -> int:
    780     """Get a thread local buffer."""
--> 781     return self._thread_locals.buffer_index

RecursionError: maximum recursion depth exceeded in comparison

The config is as follows:

cfg = hydra.compose(
    config_name="train",
    overrides=[
        "encoder=schnet",
        "task=multiclass_graph_classification",
        "dataset=ec_reaction",
        "features=ca_base",
        "+aux_task=none",
        "trainer.max_epochs=1000",
    ],
    return_hydra_config=True,
)

I am using the tutorial notebook without any further modification.

a-r-j commented 2 months ago

Hi @yangzhang33, did you manage to resolve this? I'm not exactly sure what the issue is here. I don't see anything related to proteinworkshop in your error log. Perhaps a Jupyter issue?

yangzhang33 commented 2 months ago

@a-r-j Hello, not yet. I agree it may not be a problem with the library, but it fails on both of my machines. Does it work on your side?

yangzhang33 commented 2 months ago

Hi, thanks. It is a Jupyter problem: when run as a .py file with format="pdb", there is no problem.