saezlab / pypath

Python module for prior knowledge integration. Builds databases of signaling pathways, enzyme-substrate interactions, complexes, annotations and intercellular communication roles.
http://omnipathdb.org/
GNU General Public License v3.0

`ValueError: I/O operation on closed file` in `mapping.py` #248

Open bnymnsen opened 1 year ago

bnymnsen commented 1 year ago

When I try to map a gene symbol to an Entrez ID, I get the following error:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_4272\2864091030.py in <module>
----> 1 mapping.map_name("ABCC1", "genesymbol", "entrez")

C:\anaconda3\lib\site-packages\pypath\utils\mapping.py in map_name(name, id_type, target_id_type, ncbi_tax_id, strict, expand_complexes, uniprot_cleanup)
   3562     mapper = get_mapper()
   3563 
-> 3564     return mapper.map_name(
   3565         name = name,
   3566         id_type = id_type,

C:\anaconda3\lib\site-packages\pypath\share\common.py in wrapper(*args, **kwargs)
   2770     def wrapper(*args, **kwargs):
   2771         try:
-> 2772             return func(*args, **kwargs)
   2773         except TypeError as error:
   2774             if 'unhashable type' in str(error):

C:\anaconda3\lib\site-packages\pypath\utils\mapping.py in map_name(self, name, id_type, target_id_type, ncbi_tax_id, strict, expand_complexes, uniprot_cleanup)
   1980 
   1981             # all the other ID types
-> 1982             mapped_names = self._map_name(
   1983                 name = name,
   1984                 id_type = id_type,

C:\anaconda3\lib\site-packages\pypath\utils\mapping.py in _map_name(self, name, id_type, target_id_type, ncbi_tax_id)
   2512         ncbi_tax_id = ncbi_tax_id or self.ncbi_tax_id
   2513 
-> 2514         tbl = self.which_table(
   2515             id_type,
   2516             target_id_type,

C:\anaconda3\lib\site-packages\pypath\utils\mapping.py in which_table(self, id_type, target_id_type, load, ncbi_tax_id)
   1676                         )
   1677 
-> 1678                         reader = MapReader(
   1679                             param = this_param,
   1680                             ncbi_tax_id = ncbi_tax_id,

C:\anaconda3\lib\site-packages\pypath\utils\mapping.py in __init__(self, param, ncbi_tax_id, entity_type, load_a_to_b, load_b_to_a, uniprots, lifetime, resource_id_types)
    255         self._resource_id_types = resource_id_types
    256 
--> 257         self.load()
    258 
    259 

C:\anaconda3\lib\site-packages\pypath\utils\mapping.py in load(self)
    285 
    286             # read from the original source
--> 287             self.read()
    288 
    289             if self.tables_loaded():

C:\anaconda3\lib\site-packages\pypath\utils\mapping.py in read(self)
    447         if hasattr(self, method):
    448 
--> 449             getattr(self, method)()
    450 
    451 

C:\anaconda3\lib\site-packages\pypath\utils\mapping.py in read_mapping_uniprot_list(self)
    635             else:
    636 
--> 637                 u_target = self._read_mapping_uniprot_list(
    638                     uniprot_id_type_a = 'UniProtKB_AC-ID',
    639                     uniprot_id_type_b = self.param.uniprot_id_type_a,

C:\anaconda3\lib\site-packages\pypath\utils\mapping.py in _read_mapping_uniprot_list(self, uniprot_id_type_a, uniprot_id_type_b, upload_ac_list, chunk_size)
    843                 res_c = curl.Curl(**run_args)
    844 
--> 845             result.extend(list(res_c.fileobj)[1:])
    846 
    847         return result

ValueError: I/O operation on closed file.

And here is the log file:

[2023-06-26 19:36:56] [curl] Creating Curl object to retrieve data from `https://www.ensembl.org/info/about/species.html`
[2023-06-26 19:36:56] [curl] Cache file path: `C:\Users\ASUS\.pypath\cache\535b06d53a59e75bb693369bc5fdc556-species.html`
[2023-06-26 19:36:56] [curl] Cache file found, no need for download.
[2023-06-26 19:36:56] [curl] Opening plain text file `C:\Users\ASUS\.pypath\cache\535b06d53a59e75bb693369bc5fdc556-species.html`.
[2023-06-26 19:36:56] [curl] Contents of `C:\Users\ASUS\.pypath\cache\535b06d53a59e75bb693369bc5fdc556-species.html` has been read and the file has been closed.
[2023-06-26 19:36:56] [curl] Creating Curl object to retrieve data from `https://www.ensembl.org/info/about/species.html`
[2023-06-26 19:36:56] [curl] Cache file path: `C:\Users\ASUS\.pypath\cache\535b06d53a59e75bb693369bc5fdc556-species.html`
[2023-06-26 19:36:56] [curl] Cache file found, no need for download.
[2023-06-26 19:36:56] [curl] Opening plain text file `C:\Users\ASUS\.pypath\cache\535b06d53a59e75bb693369bc5fdc556-species.html`.
[2023-06-26 19:36:56] [curl] Contents of `C:\Users\ASUS\.pypath\cache\535b06d53a59e75bb693369bc5fdc556-species.html` has been read and the file has been closed.
[2023-06-26 19:36:56] [curl] Creating Curl object to retrieve data from `https://www.ensembl.org/info/about/species.html`
[2023-06-26 19:36:56] [curl] Cache file path: `C:\Users\ASUS\.pypath\cache\535b06d53a59e75bb693369bc5fdc556-species.html`
[2023-06-26 19:36:56] [curl] Cache file found, no need for download.
[2023-06-26 19:36:56] [curl] Opening plain text file `C:\Users\ASUS\.pypath\cache\535b06d53a59e75bb693369bc5fdc556-species.html`.
[2023-06-26 19:36:56] [curl] Contents of `C:\Users\ASUS\.pypath\cache\535b06d53a59e75bb693369bc5fdc556-species.html` has been read and the file has been closed.
[2023-06-26 19:36:57] [network_resources] Could not find data model for resource `Negatome` in set `negative`.
[2023-06-26 19:36:57] [network_resources] Could not find data model for resource `SignaLink2` in set `obsolate`.
[2023-06-26 19:36:57] [network_resources] Could not find data model for resource `NCI-PID` in set `obsolate`.
[2023-06-26 19:36:57] [network_resources] Could not find data model for resource `ORegAnno` in set `transcription_deprecated`.
[2023-06-26 19:36:58] [curl] Creating Curl object to retrieve data from `https://www.ebi.ac.uk/unichem/legacy/ucquery/listSources`
[2023-06-26 19:36:58] [curl] Cache file path: `C:\Users\ASUS\.pypath\cache\ec334dad78a2cd7ed88f6069c63aa672-listSources`
[2023-06-26 19:36:58] [curl] Cache file found, no need for download.
[2023-06-26 19:36:58] [curl] Loading data from cache previously downloaded from `www.ebi.ac.uk`
[2023-06-26 19:36:58] [curl] Opening file `C:\Users\ASUS\.pypath\cache\ec334dad78a2cd7ed88f6069c63aa672-listSources`
[2023-06-26 19:36:58] [curl] Extracting data from file type `plain`
[2023-06-26 19:36:58] [curl] Opening plain text file `C:\Users\ASUS\.pypath\cache\ec334dad78a2cd7ed88f6069c63aa672-listSources`.
[2023-06-26 19:36:58] [curl] Contents of `C:\Users\ASUS\.pypath\cache\ec334dad78a2cd7ed88f6069c63aa672-listSources` has been read and the file has been closed.
[2023-06-26 19:36:58] [curl] File at `https://www.ebi.ac.uk/unichem/legacy/ucquery/listSources` successfully retrieved. Resulted file type `plain text, unicode string`. Local file at
                      `C:\Users\ASUS\.pypath\cache\ec334dad78a2cd7ed88f6069c63aa672-listSources`.
[2023-06-26 19:36:58] [curl] Creating Curl object to retrieve data from `https://rampdb.nih.gov/api/id-types`
[2023-06-26 19:36:58] [curl] Cache file path: `C:\Users\ASUS\.pypath\cache\c058901753f61743bf55db935731cff1-id-types`
[2023-06-26 19:36:58] [curl] Cache file found, no need for download.
[2023-06-26 19:36:58] [curl] Opening plain text file `C:\Users\ASUS\.pypath\cache\c058901753f61743bf55db935731cff1-id-types`.
[2023-06-26 19:36:58] [curl] Contents of `C:\Users\ASUS\.pypath\cache\c058901753f61743bf55db935731cff1-id-types` has been read and the file has been closed.
[2023-06-26 19:37:00] [mapping] Requested to load ID translation table from `genesymbol` to `entrez`, organism: 9606.
[2023-06-26 19:37:00] [mapping] Chosen ID translation table from service: service=uniprot, id_type_a=genesymbol, id_type_b=entrez
[2023-06-26 19:37:00] [mapping] Reader created for ID translation table, parameters: `ncbi_tax_id=9606, id_a=genesymbol, id_b=entrez, load_a_to_b=1, load_b_to_a=0, input_type=uniprot_list (UniprotListMapping)`.
[2023-06-26 19:37:00] [uniprot_input] Loading list of all UniProt IDs for organism `9606` (only SwissProt: None).
[2023-06-26 19:37:00] [curl] Creating Curl object to retrieve data from `https://rest.uniprot.org/uniprotkb/stream`
[2023-06-26 19:37:00] [curl] GET parameters added to the URL: `query=organism_id%3A9606&format=tsv&fields=accession`
[2023-06-26 19:37:00] [curl] Cache file path: `C:\Users\ASUS\.pypath\cache\fb5c357aad69910705244a5afddde059-stream`
[2023-06-26 19:37:00] [curl] Cache file found, no need for download.
[2023-06-26 19:37:00] [curl] Loading data from cache previously downloaded from `rest.uniprot.org`
[2023-06-26 19:37:00] [curl] Opening file `C:\Users\ASUS\.pypath\cache\fb5c357aad69910705244a5afddde059-stream`
[2023-06-26 19:37:00] [curl] Extracting data from file type `plain`
[2023-06-26 19:37:00] [curl] Opening plain text file `C:\Users\ASUS\.pypath\cache\fb5c357aad69910705244a5afddde059-stream`.
[2023-06-26 19:37:00] [curl] Contents of `C:\Users\ASUS\.pypath\cache\fb5c357aad69910705244a5afddde059-stream` has been read and the file has been closed.
[2023-06-26 19:37:00] [curl] File at `https://rest.uniprot.org/uniprotkb/stream?query=organism_id%3A9606&format=tsv&fields=accession` successfully retrieved. Resulted file type `plain text, unicode string`. Local file at
                      `C:\Users\ASUS\.pypath\cache\fb5c357aad69910705244a5afddde059-stream`.
[2023-06-26 19:37:00] [mapping] Querying the UniProt ID Mapping service for ID translation data. Querying a list of 207780 IDs.
[2023-06-26 19:37:00] [mapping] Request to UniProt ID Mapping, chunk #0 with 100000 IDs.
[2023-06-26 19:37:00] [curl] Creating Curl object to retrieve data from `https://rest.uniprot.org/idmapping/run`
[2023-06-26 19:37:00] [curl] Cache file path: `C:\Users\ASUS\.pypath\cache\4554d6f83c82b732266b0010e39df0b5-run`
[2023-06-26 19:37:00] [curl] Cache file found, no need for download.
[2023-06-26 19:37:00] [curl] Creating Curl object to retrieve data from `https://rest.uniprot.org/idmapping/run`
[2023-06-26 19:37:00] [curl] Cache file path: `C:\Users\ASUS\.pypath\cache\4554d6f83c82b732266b0010e39df0b5-run`
[2023-06-26 19:37:00] [curl] Cache file found, no need for download.
[2023-06-26 19:37:00] [curl] Opening plain text file `C:\Users\ASUS\.pypath\cache\4554d6f83c82b732266b0010e39df0b5-run`.
[2023-06-26 19:37:00] [curl] Contents of `C:\Users\ASUS\.pypath\cache\4554d6f83c82b732266b0010e39df0b5-run` has been read and the file has been closed.

I am using the latest version of pypath. I also tried mapping from entrez to ensg, and that gave a runtime error.
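For readers unfamiliar with this error: the `ValueError` at the bottom of the traceback is what Python raises whenever a file object is iterated after it has been closed. The snippet below is only a minimal illustration of that failure mode in plain Python. It is not pypath code; the `io.StringIO` object and the placeholder rows are assumptions standing in for `res_c.fileobj` and the downloaded table.

```python
import io


def read_after_close():
    """Return the error message raised when a closed file object is read."""
    # Placeholder two-line TSV: a header row followed by one data row.
    fileobj = io.StringIO("From\tTo\nID_A\tID_B\n")
    # Stands in for a downloader that closes its handle once the contents
    # have been read (hypothetical; mirrors the log message above).
    fileobj.close()

    try:
        # Same expression shape as in the traceback: skip the header row.
        return list(fileobj)[1:]
    except ValueError as error:
        return str(error)


print(read_after_close())
```

If this is indeed what happens inside pypath, the rows would have to be consumed before the handle is closed, or taken from wherever the already-read contents are kept.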

deeenes commented 1 year ago

This issue is related to the recent transition from the legacy UniProt API to the current one. Some operations do not work yet, although many others already do. Examples of failing calls are very welcome; please post them here. I'm fixing these issues right now and will keep this issue pinned until the work is complete. Thanks @bnymnsen for reporting!
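For context, the current UniProt ID Mapping service is job based: a list of IDs is submitted, the job is polled, and the result is fetched afterwards. The sketch below shows that general flow using only the standard library. It is not how pypath implements it. Apart from the `/run` URL, which appears in the log above, the `/status/` and `/stream/` paths, the `jobId` and `jobStatus` field names, and all function names are assumptions to be checked against the UniProt documentation.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed base URL; only the `/run` endpoint appears in the log above.
API = "https://rest.uniprot.org/idmapping"


def parse_tsv(text):
    """Split a TSV response into row tuples, dropping the header row."""
    lines = [line for line in text.splitlines() if line.strip()]
    return [tuple(line.split("\t")) for line in lines[1:]]


def submit_job(ids, from_db, to_db):
    """POST the ID list and return the job ID assigned by the service."""
    data = urllib.parse.urlencode(
        {"ids": ",".join(ids), "from": from_db, "to": to_db}
    ).encode()
    with urllib.request.urlopen(f"{API}/run", data=data) as response:
        return json.load(response)["jobId"]  # assumed field name


def wait_for_job(job_id, poll_seconds=3, max_polls=100):
    """Poll the assumed status endpoint until the job is no longer running."""
    for _ in range(max_polls):
        with urllib.request.urlopen(f"{API}/status/{job_id}") as response:
            payload = json.load(response)
        status = payload.get("jobStatus")  # assumed field name
        if status in ("NEW", "RUNNING"):
            time.sleep(poll_seconds)
            continue
        # No status field is taken to mean a redirect to the finished result.
        if status in (None, "FINISHED"):
            return
        raise RuntimeError(f"ID mapping job ended with status {status!r}")
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


def fetch_rows(job_id):
    """Download the result as TSV and parse it into row tuples."""
    url = f"{API}/stream/{job_id}?format=tsv"  # assumed endpoint layout
    with urllib.request.urlopen(url) as response:
        # Read everything while the handle is still open.
        text = response.read().decode("utf-8")
    return parse_tsv(text)


def map_ids(ids, from_db, to_db):
    """Run the submit, poll and fetch steps for one list of IDs."""
    job_id = submit_job(ids, from_db, to_db)
    wait_for_job(job_id)
    return fetch_rows(job_id)
```

Nothing here touches the network until `map_ids` is called. A call would pass a source database such as `UniProtKB_AC-ID` (the name seen in the traceback) and a target database name taken from the UniProt documentation.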