tgbugs / ontquery

a framework querying ontology terms
MIT License

Problems submitting new data element to InterLex #23

Open dbkeator opened 4 years ago

dbkeator commented 4 years ago

Hi,

I'm receiving the following errors when trying to submit a new data element to InterLex via the API:

    File "/Users/dbkeator/Documents/Coding/PyNIDM/nidm/experiment/Utils.py", line 612, in AddPDEToInterlex
      uri_category : categorymappings
    File "/Users/dbkeator/opt/anaconda3/lib/python3.7/site-packages/ontquery/plugins/services/interlex.py", line 93, in add_pde
      predicates = predicates)
    File "/Users/dbkeator/opt/anaconda3/lib/python3.7/site-packages/ontquery/plugins/services/interlex.py", line 140, in add_entity
      cid = cid,
    File "/Users/dbkeator/opt/anaconda3/lib/python3.7/site-packages/ontquery/plugins/services/interlex_client.py", line 516, in add_entity
      raw_entity_outout = self.add_raw_entity(entity_input)
    File "/Users/dbkeator/opt/anaconda3/lib/python3.7/site-packages/ontquery/plugins/services/interlex_client.py", line 694, in add_raw_entity
      output = self.get_entity(output['ilx'])
    KeyError: 'ilx'

This is generated from the pynidm toolbox, code to submit the data element is here: https://github.com/incf-nidash/PyNIDM/blob/master/nidm/experiment/Utils.py#L580

I tried upgrading ontquery to version 0.2.3, but then I couldn't get the toolkit to authenticate, so I don't know whether the upgrade would solve the problem above. The code is here: https://github.com/incf-nidash/PyNIDM/blob/master/nidm/experiment/Utils.py#L580

tgbugs commented 4 years ago

I think that 0.2.3 should have fixed the KeyError error. What kind of authentication issues are you running into?

tgbugs commented 4 years ago

Looking at this I think the issue is here https://github.com/incf-nidash/PyNIDM/blob/5e9ffaf75cd98b45c41fa94c10ff95a7d99b070b/nidm/experiment/Utils.py#L66, the test endpoint should be https://test3.scicrunch.org/api/1/ now.

dbkeator commented 4 years ago

What about the production endpoint? Still https://scicrunch.org/api/1/ ?

The error with 0.2.3 when trying to initialize is: module 'orthauth' has no attribute 'configure_here'

From this line: https://github.com/incf-nidash/PyNIDM/blob/master/nidm/experiment/Utils.py#L567

It doesn't seem to matter whether I'm using the test instance or the production one (given the above URL is correct). Thx

tgbugs commented 4 years ago

Argh. I think the issue is that there was a gap in my install_requires which I thought had been fixed, but apparently not. Try running pip install --upgrade orthauth (or the equivalent for the environment you are in), and see if that fixes the issue.
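A quick way to confirm which orthauth is actually installed, before and after the upgrade, is sketched below. This is only a minimal sketch: it assumes Python 3.8+ for `importlib.metadata`, the helper names `version_tuple`, `installed_version`, and `is_at_least` are made up for illustration, and the `0.0.12` threshold is an example value rather than a documented minimum. It also assumes purely numeric dotted versions.

```python
from importlib import metadata  # available in Python 3.8+


def version_tuple(version):
    """Turn a purely numeric dotted version such as '0.0.12' into (0, 0, 12)."""
    return tuple(int(part) for part in version.split("."))


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_at_least(package, minimum):
    """True if the package is installed at a version >= minimum."""
    found = installed_version(package)
    return found is not None and version_tuple(found) >= version_tuple(minimum)


# The threshold below is an assumed example, not a documented requirement.
print("orthauth:", installed_version("orthauth"))
print("new enough:", is_at_least("orthauth", "0.0.12"))
```

Comparing tuples rather than raw strings matters here, because as plain strings "0.0.12" would sort before "0.0.8".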

dbkeator commented 4 years ago

Hi, So after running the upgrade command (upgraded from 0.0.8 to 0.0.12) the authorization problem is fixed. I'll move forward with testing adding personal data elements in a little bit. Need to do a few other things first....

Thanks!

dbkeator commented 4 years ago

Hi, Seems to be giving me the same error:

File "/Users/dbkeator/opt/anaconda3/lib/python3.7/site-packages/ontquery/plugins/services/interlex.py", line 93, in add_pde predicates = predicates) File "/Users/dbkeator/opt/anaconda3/lib/python3.7/site-packages/ontquery/plugins/services/interlex.py", line 140, in add_entity cid = cid, File "/Users/dbkeator/opt/anaconda3/lib/python3.7/site-packages/ontquery/plugins/services/interlex_client.py", line 516, in add_entity raw_entity_outout = self.add_raw_entity(entity_input) File "/Users/dbkeator/opt/anaconda3/lib/python3.7/site-packages/ontquery/plugins/services/interlex_client.py", line 694, in add_raw_entity output = self.get_entity(output['ilx']) KeyError: 'ilx'

This is generated from the pynidm toolbox, code to submit the data element is here: https://github.com/incf-nidash/PyNIDM/blob/master/nidm/experiment/Utils.py#L580

Dave

tmsincomb commented 4 years ago

Looking into it now. I'll let you know as soon as I have a solution.

tgbugs commented 4 years ago

That is not the line that I would expect if the code that was running was from 0.2.3. It looks like you are still running 0.2.2.

dbkeator commented 4 years ago

    [~/Downloads/datasets.datalad.org/adhd200/RawDataBIDS/Brown]$ which python
    /Users/dbkeator/opt/anaconda3/bin/python
    [~/Downloads/datasets.datalad.org/adhd200/RawDataBIDS/Brown]$ python
    Python 3.7.4 (default, Aug 13 2019, 15:17:50)
    [Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import ontquery
    >>> import ontquery as on
    >>> print(on.__version__)
    0.2.3

dbkeator commented 4 years ago

...trying some further debugging, stepping into add_pde function ...

tmsincomb commented 4 years ago

I think there may be an issue with the update: a file may not have been merged properly by the "git pull". The line "output = self.get_entity(output['ilx'])" is no longer on line 694, but actually on 624: https://github.com/tgbugs/ontquery/blob/master/ontquery/plugins/services/interlex_client.py#L624
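One way to check for stale code is to ask Python which file it actually imported and where a given function starts in that file. The sketch below is a rough illustration using only the standard library; the helper name `locate` is made up, and the ontquery module path and method name in the final comment are taken from the traceback in this thread rather than verified against any particular release.

```python
import importlib
import inspect


def locate(module_name, qualified_name=None):
    """Return (source file, first line number) for a module or an object in it.

    The line number is None when no qualified_name is given.
    """
    module = importlib.import_module(module_name)
    source_file = inspect.getsourcefile(module)
    first_line = None
    if qualified_name is not None:
        obj = module
        # Walk dotted names such as "SomeClass.some_method".
        for part in qualified_name.split("."):
            obj = getattr(obj, part)
        _, first_line = inspect.getsourcelines(obj)
    return source_file, first_line


# Demonstrated on a standard-library module so the sketch runs anywhere.
print(locate("json", "dumps"))

# For this thread the call would be along the lines of (assumed names):
# locate("ontquery.plugins.services.interlex_client",
#        "InterLexClient.add_raw_entity")
```

If the reported path is not the site-packages directory you expect, or the line number does not match the released source, the interpreter is likely importing an older copy.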

dbkeator commented 4 years ago

Ok, I see what's happening, which will be a problem for our plan with Jeff (we may have to change the plan). The error is because I have already added a term with the same label. The error comes from here: https://github.com/tgbugs/ontquery/blob/master/ontquery/plugins/services/interlex_client.py#L610.

Our most recent plan was to have all users input their unique data element definitions and properties as a personal data element. In this particular case I may have the "gender" data element with different categorical encodings depending on what study I'm annotating (e.g. in some studies I have Male and Female, in other studies I might have M and F or M=1, F=2, or Male, Female, Other, etc.).

So we changed the pynidm code so that it just asks the user to input their data element properties, tries to insert the data element into InterLex, and then asks the user to associate a term (concept) with it (that part simply does a query of existing terms in InterLex). It appears the idea of uploading all personal data elements to InterLex, where they may overlap with existing personal data elements but have different categorical encodings, won't work...unless you have another idea?

Thanks

tmsincomb commented 4 years ago

We can add a parameter "allow_duplicate_label=True" in all the add entity functions to bypass this. @tgbugs What do you think?

However, that line belongs to an existing-term check that wouldn't exit; it would return the existing term and print a log message. There still may be a merge issue from 0.2.2 to 0.2.3. Are you by chance using a jupyter notebook to run these scripts?

tgbugs commented 4 years ago

I was thinking about allow_duplicate_label, but I don't think that is an issue, except maybe during testing if @dbkeator is trying to insert terms with the same label over and over again. Individual users in this use case definitely should not be bypassing that check. The label is not the display label and needs to carry more information, so if it is "gender in study x" and "gender in study y", that distinction needs to be in the label itself.
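To make the labeling point concrete, here is a small sketch of how a client could build study-qualified labels and catch collisions before submitting anything. The function names `qualified_label` and `find_duplicates` are hypothetical, the "in" separator simply mirrors the example wording above, and the case-insensitive comparison is an assumption about how duplicates are judged, not something confirmed for InterLex.

```python
def qualified_label(base_label, study):
    """Combine a data element name with the study it belongs to."""
    return f"{base_label.strip()} in {study.strip()}"


def find_duplicates(labels):
    """Return the labels that occur more than once, compared case-insensitively."""
    seen = set()
    duplicates = set()
    for label in labels:
        key = label.lower()
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return sorted(duplicates)


labels = [
    qualified_label("gender", "study x"),
    qualified_label("gender", "study y"),
]
print(labels)
print(find_duplicates(labels))
```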

Based on the line numbers it does seem that there is some stale code though and I want to make sure we get that cleared up before jumping to any conclusions.

dbkeator commented 4 years ago

@tmsincomb No, not using jupyter notebooks. I've run the code in PyCharm and on the command line. The result from 'add_pde' just says {KeyError} 'ilx'...so not sure what that means.

I'll step through the code again and see if I can get any further clarity on where exactly that error occurs.

tmsincomb commented 4 years ago

add_pde and any other add-entity-based functions will hit the same error, traced back to your line 694 within the old interlex_client.py. The init for interlex_client.py's InterLexClient class is

    def __init__(self, base_url: str = default_base_url):
        """Short summary.
        :param str base_url: . Defaults to default_base_url.
        """
        InterlexSession.__init__(self,
                                 key = self.api_key,
                                 host = base_url)
        self.base_url = base_url
        self.user_id = self._get('user/info')['id']

The init is only 10 lines long in version 0.2.3. Can you please verify whether that init matches yours?

dbkeator commented 4 years ago

entity: {'type': 'pde', 'definition': 'Gender of participant', 'uid': 33551, 'label': 'gender', 'ilx': 'ilx_0738263'}

From this line: https://github.com/tgbugs/ontquery/blob/master/ontquery/plugins/services/interlex_client.py#L603

error message: {'errormsg': 'gender already exists and was created by you', 'success': False}

Then it skips this line: https://github.com/tgbugs/ontquery/blob/master/ontquery/plugins/services/interlex_client.py#L613

and here's the bug, no 'ilx' in the error message: https://github.com/tgbugs/ontquery/blob/master/ontquery/plugins/services/interlex_client.py#L624
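The failure mode described above, indexing `'ilx'` on a response that is really an error message, can be illustrated in isolation. This is a sketch of one defensive pattern and not ontquery's actual code: the helper name `extract_ilx` is hypothetical, and the two dictionaries are modeled on the entity and error message quoted in this comment.

```python
def extract_ilx(response):
    """Return the 'ilx' id from a response dict, or raise a descriptive error."""
    if response.get("success") is False or "ilx" not in response:
        message = response.get("errormsg", "no 'ilx' field in response")
        raise ValueError(f"entity was not created: {message}")
    return response["ilx"]


# Shapes modeled on the values quoted in this thread.
ok = {"ilx": "ilx_0738263", "label": "gender"}
failed = {"errormsg": "gender already exists and was created by you",
          "success": False}

print(extract_ilx(ok))
try:
    extract_ilx(failed)
except ValueError as error:
    print(error)
```

Raising an error that carries the server's own message would have surfaced "already exists" directly instead of a bare KeyError.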

dbkeator commented 4 years ago

@tmsincomb Looks the same:

    def __init__(self, base_url: str = default_base_url):
        """Short summary.

        :param str base_url: . Defaults to default_base_url.
        """
        InterlexSession.__init__(self,
                                 key = self.api_key,
                                 host = base_url)
        self.base_url = base_url
        self.user_id = self._get('user/info')['id']

tgbugs commented 4 years ago

So it looks like we do have a bug in the handling of the case where there is a term with the same label. Does it work if you give it a label like 'gender-2'? If it does, then we know that is the extent of the immediate issue, and we can work to improve the error reporting while we determine what the right thing to do is with regard to duplicate labels in this case. Pulling in @jgrethe to provide some context on https://github.com/tgbugs/ontquery/issues/23#issuecomment-650313993.

dbkeator commented 4 years ago

Hi, changing the label to 'gender-2' worked in the sense that it got beyond my current problem. Now I have a problem with one of the properties 'ilx_0382131' which it says does not exist. That property I believe I received from @jgrethe which should be the 'data type' but doesn't seem to exist in InterLex at that ID. I see 2 DICOM terms, one called 'data type' and the other 'value type'. Not sure if @jgrethe thinks one of those existing terms is appropriate for this property or not. What we're encoding for 'data type' is the XSD schema type of value for this particular data element (e.g. xsd:integer, xsd:string, etc.).

cheers

jgrethe commented 4 years ago

Sorry for not jumping in sooner - was teaching and then rolled into an NIH review panel...

The issue is due to InterLex treating all terms equally and applying the same rules whether you as a user are defining a concept or a PDE. For concepts, the naming restrictions may be appropriate. However, for PDEs I can see a user having multiple datasets with PDEs of the same name that may have different attributes.

Need to look into ilx_0382131 as it shows up as an annotation but seems to be DICOM related.

In Interlex we have relationships such as the below. And these are not what you are looking for.

has datum type
Description: The type of datum (e.g. measurement, statistics, ...) which is a child of IAO_0000109
Preferred ID: ILX:0738262
Type: relationship
ID# ILX:0738262
Score: 0

has measurement type
Description: The type of measurement being conducted
Preferred ID: ILX:0381388
Type: relationship
ID# ILX:0381388
Score: 0

tmsincomb commented 4 years ago

@dbkeator Can I get the first 6 digits of the commit sha from the git log head? I want to try the exact version you're using to recreate the issue.

dbkeator commented 4 years ago

@tmsincomb I assume you're asking about pynidm? If so, use: https://github.com/dbkeator/PyNIDM, commit: f5f92f37177d2839b42927a506b2683ccaff8c7b. If you're asking about ontquery, then I'm using the latest version one gets with pip install ontquery (0.2.3). Not sure how to get the commit hash for pip-installed packages. Thanks!

tmsincomb commented 4 years ago

@dbkeator I'm sorry for being a broken record, but I wanted to ask more questions to debug https://github.com/tgbugs/ontquery/issues/23#issuecomment-650296297. Can I get the input that crashed add_pde? Would it be okay if you run

pip install --force-reinstall ontquery

to make sure stale code isn't being left in place by a regular upgrade? I tested 0.2.3, 0.2.2, and the pip ontquery in its own environment, but I can't replicate the error. If you keep getting the error

    File "/Users/dbkeator/opt/anaconda3/lib/python3.7/site-packages/ontquery/plugins/services/interlex_client.py", line 694, in add_raw_entity
      output = self.get_entity(output['ilx'])

even though that code no longer exists on line 694, this would indicate a pip install bug that might need a fresh install to clear.

dbkeator commented 4 years ago

Hi, so after the --force-reinstall command, the only error I have now is that the InterLex property I'm trying to use does not exist, which is fixable on my end:

"relationship_ilx: http://uri.interlex.org/base/ilx__0382131 does not exist".

This is supposed to be the "data type" or "value type" relation for a new PDE. Is there a more appropriate one, @jgrethe? I'm not looking for the measurement type or the datum type. This is a straightforward data type (xsd:integer, xsd:float, xsd:string, etc.). I don't think that's what we intended for either datum type or measurement type....

Thanks