oda-hub / nb2workflow

GNU General Public License v3.0

missing compulsory argument in nbadapter call to NB2WProductQuery from dispatcher-plugin-nb2workflow #199


ferrigno commented 2 months ago
File ~/Soft/nb2workflow/nb2workflow/nbadapter.py:1096, in validate_oda_dispatcher(nba, optional, machine_readable)
   1094         raise
   1095 else:
-> 1096     nbpq = NB2WProductQuery('testname', 
   1097                     'testproduct', 
   1098                     nba.extract_parameters(),
   1099                     nba.extract_output_declarations())
   1100                     
   1102     output = nba.extract_output()

TypeError: NB2WProductQuery.__init__() missing 1 required positional argument: 'ontology_path'

Introducing the oda_ontology_path defined in the file gives another error:

File ~/Soft/nb2workflow/nb2workflow/nbadapter.py:1096, in validate_oda_dispatcher(nba, optional, machine_readable)
   1094         raise
   1095 else:
-> 1096     nbpq = NB2WProductQuery('testname', 
   1097                     'testproduct', 
   1098                     nba.extract_parameters(),
   1099                     nba.extract_output_declarations(),
   1100                     oda_ontology_path)
   1102     output = nba.extract_output()
   1104     logger.debug(json.dumps(output, indent=4))

File ~/Soft/dispatcher-plugin-nb2workflow/dispatcher_plugin_nb2workflow/queries.py:102, in NB2WProductQuery.__init__(self, name, backend_product_name, backend_param_dict, backend_output_dict, ontology_path)
    100 self.backend_output_dict = backend_output_dict
    101 self.backend_param_dict = backend_param_dict
--> 102 parameter_lists = construct_parameter_lists(backend_param_dict, ontology_path)
    103 self.par_name_substitution = parameter_lists['par_name_substitution']
    104 plist = deepcopy(parameter_lists['prod_plist'])

File ~/Soft/dispatcher-plugin-nb2workflow/dispatcher_plugin_nb2workflow/queries.py:24, in with_hashable_dict.<locals>.wrapper(backend_param_dict, ontology_path)
     22 @wraps(func)
     23 def wrapper(backend_param_dict, ontology_path):
---> 24     return func(HashableDict(backend_param_dict), ontology_path)

File ~/Soft/dispatcher-plugin-nb2workflow/dispatcher_plugin_nb2workflow/queries.py:19, in HashableDict.__hash__(self)
     18 def __hash__(self):
---> 19     return hash(dumps(self, sort_keys=True))

File ~/.conda/envs/py10/lib/python3.10/json/__init__.py:238, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232 if cls is None:
    233     cls = JSONEncoder
    234 return cls(
    235     skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236     check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237     separators=separators, default=default, sort_keys=sort_keys,
--> 238     **kw).encode(obj)

File ~/.conda/envs/py10/lib/python3.10/json/encoder.py:199, in JSONEncoder.encode(self, o)
    195         return encode_basestring(o)
    196 # This doesn't pass the iterator directly to ''.join() because the
    197 # exceptions aren't as detailed.  The list call should be roughly
    198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
    201     chunks = list(chunks)

File ~/.conda/envs/py10/lib/python3.10/json/encoder.py:257, in JSONEncoder.iterencode(self, o, _one_shot)
    252 else:
    253     _iterencode = _make_iterencode(
    254         markers, self.default, _encoder, self.indent, floatstr,
    255         self.key_separator, self.item_separator, self.sort_keys,
    256         self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)

File ~/.conda/envs/py10/lib/python3.10/json/encoder.py:179, in JSONEncoder.default(self, o)
    160 def default(self, o):
    161     """Implement this method in a subclass such that it returns
    162     a serializable object for ``o``, or calls the base implementation
    163     (to raise a ``TypeError``).
   (...)
    177 
    178     """
--> 179     raise TypeError(f'Object of type {o.__class__.__name__} '
    180                     f'is not JSON serializable')

TypeError: Object of type type is not JSON serializable

This goes beyond my understanding.

(I used the master branch of all repositories with Python 3.10.)
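The final TypeError can be reproduced with the standard library alone: json.dumps fails as soon as a dictionary value is a Python class (an instance of `type`), which is what HashableDict.__hash__ feeds it. A minimal sketch (the dict content is a hypothetical stand-in for what backend_param_dict may contain):

```python
import json

# Minimal stdlib reproduction of the final TypeError: json.dumps cannot
# encode Python class objects (instances of `type`). The dict below is a
# hypothetical stand-in for what backend_param_dict may contain.
param_dict = {"T1": str}

try:
    json.dumps(param_dict, sort_keys=True)
except TypeError as err:
    print(err)  # Object of type type is not JSON serializable

# One conceivable workaround for the hash input only: fall back to
# repr() for objects the encoder cannot handle. This changes nothing
# about real serialization, only the string that gets hashed.
stable = json.dumps(param_dict, sort_keys=True, default=repr)
```

So the bug is not in the JSON machinery itself but in handing it a dict whose values are classes rather than plain data.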

dsavchenko commented 2 months ago

We haven't used this functionality anywhere for quite a while, so this function has become incompatible with the current dispatcher/plugin. And it stays untested as long as the dispatcher/plugin is no longer among the dependencies.

@ferrigno in which scenario do you need it?

@volodymyrss do we really want it like this, so that dispatcher/plugin is an optional dependency of nb2workflow?

ferrigno commented 1 month ago

@dsavchenko, I was trying to run a test notebook for my workflows following the suggested example.

See: https://gitlab.renkulab.io/astronomy/mmoda/jemx-expert/-/blob/main/test_products.ipynb?ref_type=heads

I ran it from my local virtual environment, where I installed nb2workflow and dispatcher-plugin-nb2workflow from the git repositories as suggested by @volodymyrss (I also had to install the dispatcher, which I did from the git repository).

dsavchenko commented 1 month ago

You don't really need the dispatcher/plugin in this context. You won't encounter the exception if you remove them from the virtual environment. We still need to adapt this part, though, so let's keep this issue open.

volodymyrss commented 1 month ago

I think the plugin was used to construct products from the test output, also verifying that the output annotation works.

That is, it should be used. It's not great to use a dispatcher plugin for this, and it would be better to use product classes from oda-api. But I do not suppose it's possible now, @dsavchenko?

ferrigno commented 1 month ago

I can just confirm that I had to introduce the plugin because I encountered an error and could not obtain any result from the test. Removing it gives "unable to import dispatcher_plugin_nb2workflow.queries.NB2WProductQuery: No module named 'dispatcher_plugin_nb2workflow'".

dsavchenko commented 1 month ago

I can just confirm that I had to introduce the plugin because I encountered an error and could not obtain any result from the test. Removing it gives "unable to import dispatcher_plugin_nb2workflow.queries.NB2WProductQuery: No module named 'dispatcher_plugin_nb2workflow'".

Unsurprisingly. You explicitly import it in the test notebook, but in fact never use it afterwards.

ferrigno commented 1 month ago

The explicit import was just left over from a test; the error also appears without the import.

ferrigno commented 1 month ago

You can run the test notebook from this session: https://renkulab.io/projects/astronomy/mmoda/isgri-expert/sessions/new?autostart=1&commit=bf08578d3050bce09000312050929d8692f03cde&branch=main

dsavchenko commented 1 month ago

There are both the dispatcher and the plugin in the requirements. Unless you use them explicitly for some specific reason, you can remove both; then you won't encounter the exception.

dsavchenko commented 1 month ago

I think the plugin was used to construct products from the test output

One can try to do it, but I don't think we ever did this explicitly.

It was used to test how the dispatcher "understands" annotations. But as far as I remember, this information isn't reflected anywhere afterwards (if available, it stays as a variable in the container). Besides the general problems with complicated requirements, this also increases the container size, which is rather big in any case.

volodymyrss commented 1 month ago

I think the plugin was used to construct products from the test output

One can try to do it, but I don't think we ever did this explicitly.

It was used to test how the dispatcher "understands" annotations. But as far as I remember, this information isn't reflected anywhere afterwards (if available, it stays as a variable in the container).

This was done and even sent by email, extracted from the container here.

It would be nice to recover this functionality along with the bot tests; otherwise it becomes progressively harder to make sure the workflows remain valid.

Besides the general problems with complicated requirements, this also increases the container size, which is rather big in any case.

It's true that it's better to do it from outside of the container, within oda-bot.

ferrigno commented 1 month ago

The import of nb2workflow is as suggested in the development guide ... at least when I started this project. The dispatcher plug-in is commented out. I'll let you discuss and solve this issue, as it goes beyond my field of competence. For me, it is just inconvenient that I cannot run a test as suggested.

volodymyrss commented 1 month ago

The import of nb2workflow is as suggested in the development guide ... at least when I started this project. The dispatcher plug-in is commented out. I'll let you discuss and solve this issue, as it goes beyond my field of competence. For me, it is just inconvenient that I cannot run a test as suggested.

We are discussing with @dsavchenko what is the best next step for sustainable implementation, in relation to https://github.com/oda-hub/oda-bot/issues/38 .

If you want to help, @ferrigno, you could perhaps propose a change to the example or the manual, to reflect your experience? That would be useful!

dsavchenko commented 1 month ago

The import of nb2workflow is as suggested in the development guide ...

nb2workflow is obviously needed, as it's directly used in the tests. But one normally doesn't need any dispatcher-related dependencies in the session. Some of them may eventually be used automatically in the deployment process, though.

volodymyrss commented 1 month ago

@anawas, could you please try to run a test in your repository and see if you have the same issue? If yes (you probably will), try to fix it.

If not, please help @ferrigno with his repository; it's all the same mechanism.

anawas commented 1 month ago

I can reproduce the error on my machine. Additionally, I get the following warnings/error:

T1="2022-10-22T05:18:22.0" #http://odahub.io/ontology#StartTime --> Unknown datatype for owl_uri http://odahub.io/ontology#StartTime
T2='2022-10-22T23:10:16.0' #http://odahub.io/ontology#EndTime --> Unknown datatype for owl_uri http://odahub.io/ontology#EndTime
...

And there are complaints about unknown owl type URIs, such as:

Unknown owl type uri http://odahub.io/ontology#Float. Creating basic Parameter object.
Unknown owl type uri http://odahub.io/ontology#String. Creating basic Parameter object.
Unknown owl type uri http://odahub.io/ontology#Boolean. Creating basic Parameter object.
Unknown owl type uri http://odahub.io/ontology#String. Creating basic Parameter object.

I am not sure if this is somehow related.

ferrigno commented 1 month ago

I corrected some wrong annotations in the parameter cell for both isgri and jemx expert repositories, but not T1 and T2, as they are valid ontology terms, also suggested in the development guide.

If I change the annotations of T1 and T2 to StartTimeISOT, the warning disappears, but the error when running with nb2workflow remains, so there is no relation.
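For illustration, a hypothetical parameter cell with the changed annotations (the EndTimeISOT term for T2 is assumed by analogy with StartTimeISOT; the values are the ones from the warnings quoted above):

```python
# Hypothetical notebook parameter cell. Annotating with StartTimeISOT
# instead of StartTime silences the "Unknown datatype" warning;
# EndTimeISOT for T2 is assumed by analogy.
T1 = "2022-10-22T05:18:22.0"  # http://odahub.io/ontology#StartTimeISOT
T2 = "2022-10-22T23:10:16.0"  # http://odahub.io/ontology#EndTimeISOT
```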

ferrigno commented 1 month ago

This should be related: https://github.com/oda-hub/dispatcher-plugin-nb2workflow/issues/111. @anawas, you can correct that class and it will help with this issue.

anawas commented 1 month ago

I found the line in the code where the exception is raised. The class NB2WProductQuery requires the argument ontology_path (in project dispatcher-plugin-nb2workflow, file queries.py).

  1. What is this ontology_path? I cannot find either a documentation nor a type hint.
  2. And where should this parameter be set?
volodymyrss commented 1 month ago

I found the line in the code where the exception is raised. The class NB2WProductQuery requires the argument ontology_path (in project dispatcher-plugin-nb2workflow, file queries.py).

1. What is this ontology_path? I cannot find either a documentation nor a type hint.
2. And where should this parameter be set?

It's a path to the ontology file to be loaded. You can see how this is done in the tests, in this case and in other similar cases.

You should be able to set a default published ontology path.

Could you please just cross-check that it is used as intended?
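A possible sketch of such a default (the wrapper function below is hypothetical; NB2WProductQuery and its arguments are taken from the traceback above, and the URL is the published ontology mentioned later in this thread):

```python
# Hypothetical sketch: default the ontology_path argument to the
# published ODA ontology so the validation helper keeps working.
DEFAULT_ONTOLOGY_PATH = "https://odahub.io/ontology/ontology.ttl"

def make_product_query(name, product_name, param_dict, output_dict,
                       ontology_path=None):
    # Imported lazily, since the plugin is an optional dependency
    # of nb2workflow (as discussed above).
    from dispatcher_plugin_nb2workflow.queries import NB2WProductQuery
    return NB2WProductQuery(name, product_name, param_dict, output_dict,
                            ontology_path or DEFAULT_ONTOLOGY_PATH)
```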

anawas commented 1 month ago

The following URL returns 404 when called. It's in the file ontology.py: https://raw.githubusercontent.com/FnOio/fnoio.github.io/master/ontology/0.4.1/function.ttl What is it for? I wanted to use this URL to get a default ontology.

In addition, I wonder if this is the cause of the frequent "not a turtle: None" messages.

dsavchenko commented 1 month ago

https://raw.githubusercontent.com/FnOio/fnoio.github.io/master/ontology/0.4.1/function.ttl

That's not our default ontology; it was used for somewhat unrelated purposes. @volodymyrss may have more insight. Anyway, it's requested in a try-except, so it can't raise an issue.

Our ontology is served from https://odahub.io/ontology/ontology.ttl and this path is hardcoded as default in the code where applicable.

volodymyrss commented 1 month ago

Yes, the fno is an old one. They have an update somewhere. But we should use our own (it has the references to upstream where suitable).

anawas commented 1 week ago

I can now run some of the tests locally here: astronomy/mmoda/mmoda-nb2workflow-example. However, when I run Carlo's notebook (https://gitlab.renkulab.io/astronomy/mmoda/jemx-expert/-/blob/main/test_products.ipynb?ref_type=heads) I get a timeout for the following URL, which seems not to be online: dispatcher-staging.obsuks1.unige.ch

What is it used for? What is it supposed to return? I may need to write a mock for it.

ferrigno commented 1 week ago

That URL is available only in the virtual network of the University of Geneva. You can use the public endpoint https://www.astro.unige.ch/mmoda/dispatch-data instead.

ferrigno commented 1 week ago

Or use 'host_type': 'production' instead of 'host_type': 'staging'.
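A minimal sketch of that configuration change, assuming the test notebook reads a plain settings dict (the key names follow the comment above; the explicit URL alternative is an assumption):

```python
# Hypothetical test-notebook settings: use the public production
# endpoint instead of staging, which is only reachable inside the
# UNIGE virtual network.
settings = {
    "host_type": "production",  # was: "staging"
}

# Alternatively (assumption: the notebook accepts an explicit URL),
# point directly at the public dispatcher endpoint:
public_dispatcher = "https://www.astro.unige.ch/mmoda/dispatch-data"
```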

The current version of the test NB at https://gitlab.renkulab.io/astronomy/mmoda/isgri-expert/-/blob/main/test_products.ipynb?ref_type=heads is updated. I am running it again so that you can get the products from the backend (for independent reasons, they are no longer cached).