Closed reckart closed 6 years ago
I have deployed a component and tried to run it on the platform.
For an application, you can run it directly after you register it. For a component, this is not possible.
I know. I have built a workflow which makes use of the component that I had deployed (cf. : #7)
FYI @azielinskiACC
OK I had a look into Galaxy. VariableMentionDisambiguator is a UIMA component with the following coordinates
eu.openminted.uc-tdm-socialsciences ss-variable-detection 1.0.1-SNAPSHOT
Is it available on Maven Central? zoidberg public snapshots? OMTD repo? -> the executor that we have does not look there.
Also, the workflow is created in the OMTD Workflow Editor instance of Galaxy. Then the OMTD Registry copies it to the OMTD Workflow Execution instance of Galaxy. Do you know the name of the workflow so I can check if it is there?
OMTD repo? -> the executor that we have does not look there.
It is in the OMTD SNAPSHOTs repo. The registry seems to be able to resolve artifacts from there. Would it be possible to ensure that the executors and the registry use the same sets of repos to look up components, best also in the same order.
The workflow URL is: https://test.openminted.eu/landingPage/application/c58d1986-690e-40b9-b408-f649443c7d33
It is in the OMTD SNAPSHOTs repo. The registry seems to be able to resolve artifacts from there. Would it be possible to ensure that the executors and the registry use the same sets of repos to look up components, best also in the same order.
Until now it was not required. Added it on my TO-DO list.
The workflow URL is: https://test.openminted.eu/landingPage/application/c58d1986-690e-40b9-b408-f649443c7d33
Downloaded the metadata record from Registry (attached). The workflow name is 0931730980607790@openminted.eu 13865a76-613b-475a-88bf-4af5357b9263
I downloaded it from Galaxy executor (also attached). It is empty, no steps. Probably this is why it fails. It seems a Registry issue.
I'll try building a new one.
Ok. Please send me the landing page as you did with the previous one. I will download the metadata record, find the Galaxy workflow and check if it is OK. If it is not, we have to inform Antonis.
Ok, I have created a new one. This time, it is not empty when I re-open it in the workflow editor:
https://test.openminted.eu/landingPage/application/89d5e9ea-32fb-45f7-bf00-1fe466e33c4f
However, it still fails:
@azielinskiACC @galanisd note that I have pasted a full multi-line XML file into the parameter variableSpecification - not sure if that could cause a problem. Aside from the XML getting a bit squashed down when pasting it into the input field, it seemed ok in the Galaxy editor.
<?xml version="1.0" encoding="UTF-8"?>
<variables>
<variable v_id="140" correct="YesNo">
<v_label>INGLEHART-INDEX </v_label>
<v_topic>Political attitudes and participation</v_topic>
<v_question> What are your political priorities? </v_question>
<v_subquestion> </v_subquestion>
<v_answer a_id="1">Postmaterialist</v_answer>
<v_answer a_id="2">Postmaterialist mixed-type</v_answer>
<v_answer a_id="3">Materialist mixed-type</v_answer>
<v_answer a_id="4">Materialist</v_answer>
<v_answer a_id="5">Don't know</v_answer>
<v_answer a_id="99">No answer</v_answer>
</variable>
</variables>
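Whether the squashing matters can be checked locally. The following is only a sketch: it assumes the input field does nothing more than collapse line breaks and runs of whitespace into single spaces, and it uses a shortened copy of the XML above. The helper names `squash` and `answers` are made up for illustration; it says nothing about how Galaxy escapes or stores the parameter.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the variable specification pasted above.
pasted_xml = """<?xml version="1.0" encoding="UTF-8"?>
<variables>
<variable v_id="140" correct="YesNo">
<v_label>INGLEHART-INDEX </v_label>
<v_answer a_id="1">Postmaterialist</v_answer>
<v_answer a_id="99">No answer</v_answer>
</variable>
</variables>"""


def squash(text: str) -> str:
    """Collapse every run of whitespace (including newlines) into one space.

    ASSUMPTION: this imitates what a single-line input field does to pasted text.
    """
    return " ".join(text.split())


def answers(xml_text: str) -> list:
    """Parse the XML and return (a_id, answer text) pairs."""
    # Parse from bytes so the encoding declaration is honoured by the parser.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [(a.get("a_id"), (a.text or "").strip()) for a in root.iter("v_answer")]


squashed_xml = squash(pasted_xml)
# If squashing were harmful, parsing would fail or the answers would differ.
same_content = answers(pasted_xml) == answers(squashed_xml)
print(same_content)
```

If this holds, whitespace squashing alone should not break the specification, and the failure is more likely to come from somewhere else.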
The other thing is that the component should try to download a model from the OMTD Maven repo. That means it must have network access to that repo.
<groupId>eu.openminted.uc-tdm-socialsciences</groupId>
<artifactId>ss-variable-detection-model-disambiguation-en-ss</artifactId>
<version>20180406.1</version>
Hm... that said, it might actually try to download the model from the wrong repo (i.e. the DKPro Core repo instead of the OMTD repo...). That is something I need to look into locally.
Opened an issue regarding model-auto-downloads here: https://github.com/openminted/omtd-component-executor/issues/1
Yes, now it is not empty. The workflow is this: 0931730980607790@openminted.eu 3c6c03b5-9a04-41bb-996a-a2cd536c7ace
I see the following error in the logs of workflow-service, which is the module that calls Galaxy.
--- [ Thread-625] e.o.w.service.WorkflowServiceImpl : Unable to locate workflow: 0931730980607790%40openminted.eu+3c6c03b5-9a04-41bb-996a-a2cd536c7ace
Maybe it has to do with the name of the workflow. It contains spaces and a "@" which are escaped at some point. @courado @greenwoodma @antleb
Ok. I have:
Then I tried running the workflow again on the variable test corpus that @azielinskiACC has published on the platform.
Still, I get a failure again.
Any idea what could be the reason now?
I assume that again the workflow-service fails to call the workflow that was created @ Galaxy executor. As I said above probably the reason is the name of the workflow.
I've just pushed a fix for this that should URL decode the workflow name before looking for it in Galaxy. This should get built and pushed to beta automatically but won't end up on test until someone manually pulls in the latest workflow service code.
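To illustrate the mismatch, here is a minimal Python sketch (not the actual workflow-service code). It assumes the name arrives form-encoded, as in the error message above, and that Galaxy stores the plain name; `KNOWN_WORKFLOWS` and `locate_workflow` are hypothetical stand-ins for the real lookup.

```python
from urllib.parse import quote_plus, unquote_plus

# Workflow name as stored in Galaxy (taken from this thread).
GALAXY_NAME = "0931730980607790@openminted.eu 3c6c03b5-9a04-41bb-996a-a2cd536c7ace"

# Hypothetical stand-in for the set of workflow names known to Galaxy.
KNOWN_WORKFLOWS = {GALAXY_NAME}


def locate_workflow(name: str, decode: bool) -> bool:
    """Return True if the (optionally URL-decoded) name is a known workflow."""
    lookup_name = unquote_plus(name) if decode else name
    return lookup_name in KNOWN_WORKFLOWS


# Form-style encoding turns "@" into "%40" and the space into "+",
# which is the shape of the name reported in the log.
encoded_name = quote_plus(GALAXY_NAME)
print(encoded_name)

found_without_decoding = locate_workflow(encoded_name, decode=False)
found_with_decoding = locate_workflow(encoded_name, decode=True)
print(found_without_decoding, found_with_decoding)
```

Under these assumptions the lookup only succeeds once the name is decoded, which is consistent with workflows whose names contain neither "@" nor spaces not being affected.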
I have also added the error message supplied from the workflow service under the My Operations page
@courado great! :)
I just tried running the workflow again, but it fails being unable to locate the named workflow.
Could somebody please push @greenwoodma's fix to test.openminted.eu?
@reckart is it not possible to rename the workflow to avoid the bug until the fix is pushed to test?
@greenwoodma how do I do that? The workflow editor only has a "save" button, not a "rename" or "save as" button as far as I remember.
I think that the only way to do that is:
a. rename the workflow in Galaxy
b. download the metadata record of the app and delete it from the registry
c. upload an updated metadata record with the new workflow name.
@reckart hmmm I thought the name of the workflow came from the name you gave the app in the registry UI, but maybe not, or maybe you can't change it there either. Certainly the workflow editor just gets passed the name from the platform it doesn't generate it.
Well, the name I have given to the workflow in the registry UI is "Simple Variable Disambiguation Example (English)". 0931730980607790@openminted.eu 3c6c03b5-9a04-41bb-996a-a2cd536c7ace
looks like an auto-generated ID over which I probably do not have control. My guess would be that it is a representation of the user-id concatenated with some other ID...
What's weird is that if all workflow IDs are generated the same way then how have we ever run a workflow as we'd have hit this issue every time? I'm seriously confused by this one.
Apparently one can edit the workflow name in Galaxy by clicking on the pre-generated name, entering a new value and pressing ENTER. I did that (see screenshot).
However, when I press "save" now, nothing happens. Odd...
Ok, when I go back to "My applications" and re-open the workflow in the editor, I can see that the name I put is still there, so I guess the "save" must have worked.
I wonder what happens if I created a second workflow by the same name...
Anyway, running the now re-named workflow still gives me the same message:
Failed
Unable to locate named workflow
@courado the "My operations" view has a date, but not a time stamp. It would be great if we could also see the submission and possibly completion times of the execution there.
@greenwoodma
Workflow names @ Galaxy are not all generated the same way.
Also, the workflow ID is a different thing than the workflow name. For each workflow name there is an internal unique workflow ID; this is the one you retrieve from Galaxy in workflow-service in order to initiate a workflow execution.
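The name-to-ID step can be sketched as follows. This is an illustration only: the listing is a hand-written list of dictionaries, the `id` values are invented, and `resolve_workflow_id` is a hypothetical helper, not a function of workflow-service or of Galaxy.

```python
from typing import Dict, List

# Hypothetical workflow listing; the "id" values are invented for illustration.
workflow_listing: List[Dict[str, str]] = [
    {"id": "wf-internal-001",
     "name": "0931730980607790@openminted.eu 13865a76-613b-475a-88bf-4af5357b9263"},
    {"id": "wf-internal-002",
     "name": "0931730980607790@openminted.eu 3c6c03b5-9a04-41bb-996a-a2cd536c7ace"},
]


def resolve_workflow_id(name: str, listing: List[Dict[str, str]]) -> str:
    """Map a workflow name to its internal ID, insisting on exactly one match."""
    matches = [entry["id"] for entry in listing if entry["name"] == name]
    if not matches:
        raise LookupError(f"Unable to locate named workflow: {name!r}")
    if len(matches) > 1:
        # Two workflows sharing a name would make the lookup ambiguous.
        raise LookupError(f"Workflow name is not unique: {name!r}")
    return matches[0]


resolved_id = resolve_workflow_id(
    "0931730980607790@openminted.eu 3c6c03b5-9a04-41bb-996a-a2cd536c7ace",
    workflow_listing,
)
print(resolved_id)
```

The sketch also shows why a renamed or duplicated name is a problem: the name is the only key used for the lookup, so it has to match exactly and be unique.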
@greenwoodma
Apparently one can edit the workflow name in Galaxy by clicking on the pre-generated name, entering a new value and pressing ENTER. I did that (see screenshot).
I assume that this shouldn't be allowed and should be hidden, like some other things @ Galaxy Editor.
@galanisd yep, that most definitely shouldn't be allowed. I'll add it to the list of things I need to fix.
OK great! It is not a blocking issue but this is
The applications that are created in Galaxy editor and then ingested in OMTD Registry seem to have this problem.
@courado
Indeed. We cannot proceed in the SSH UC (WP9) due to this issue at the moment.
@azielinskiACC
Ok so:
user id + UUID
that is generated when you create a new application with the workflow editor.
The registry passes the name generated upon creation to the workflow service
and thus won't find it [see save workflow save step 3].
TODO: I need to add a loading screen when you press save.
When you save the workflow the steps are:
When you load a workflow: if it is already in the editor, show it; else load it from the registry and show it.
PS @galanisd:
Thanks @courado that all makes sense. Looks like the only issue is that the auto generated name doesn't get unencoded properly when passed into the workflow service hence it's looking for a workflow containing %40 instead of @ etc. My fix was to add a decode call inside the workflow service which should solve the problem once that code is deployed to test.
I agree the workflow IDs are volatile as they change every time you export/import them into Galaxy. The only thing that's fixed is the workflow name, so we do need that to be auto-generated and unique, so the current approach is great. I just need to fix the editor to stop people being able to change the name.
@courado Is it possible to redeploy only workflow-service @ test so that we can check if Mark's fix works? I think this is the easiest solution.
I've just pushed a fix to the galaxy editor branch which stops you editing the workflow name from within the editor so that should appear next time test is fully updated with the latest versions of everything (assuming updating test includes the galaxy editor)
I am still getting "Failed - Unable to locate named workflow".
Could anybody please update test.openminted.eu with the fixes that were discussed and implemented?
I assume this will not be the last issue in the attempt of getting the SSH UC components running on the platform... and time is running out quickly.
@reckart apparently test.openminted.eu has now been updated (sometime yesterday morning, plus again right now) so if you could try running your workflow again and see what happens?
Well... guess what: Failed - Unable to locate named workflow
Damn, damn, damn and damn!
What I don't understand is that I tried to reproduce this myself by creating a new workflow through the galaxy editor and it worked. Having said that it looks as if it's worked because the workflow name doesn't contain an @ symbol like yours does.
Could you try creating a new workflow to see if that works (i.e. if something in the registry has changed the way it creates workflow names)? The only other thing I can think of is that while test has been updated, the workflow service is still the old one, but I'm not sure how to check that. @galanisd any ideas how we would check if the latest code had made it to test?
also @reckart did you ever change the workflow name back after you managed to edit it? If not that would certainly screw things up
@greenwoodma I have no idea what the old name was.
@reckart well that explains things then. Looking earlier in the issue I think it was
0931730980607790@openminted.eu 13865a76-613b-475a-88bf-4af5357b9263
if you can change it back to that then it might work, otherwise you need to create a new workflow to see if this has been fixed or not
I have rebuilt the workflow from scratch... now it is "running". Let's see if it terminates.
@reckart is it worth closing this issue then, given it's now quite long and focused on the workflow name bug, and then opening another one if it fails with a different error?
@greenwoodma I have changed the issue title to "OpenMinTeD SSH UC Hackathon" - while the workflow name issue seems to be resolved now, the workflow still has not completed successfully.
@reckart makes sense, just didn't know if you wanted a clean slate to report new issues, but renaming it for the hackathon makes sense
What is the usual time between status "running" and "completed"?
@courado as a feature request to the registry:
The workflow is trivial and the corpus is rather small, yet the workflow is still in "running" state after three hours...
@reckart how small is small? @antleb e-mailed me earlier about a slow running workflow. I'm beginning to wonder if the galaxy executor instance has been redeployed without the speed up fix we worked on for the issue that arose during the Paris meeting. I'm not entirely sure how to go about checking if that fix is in place or not -- will try and dig out the details for logging into the machine to check.
The corpus description says "one file"
https://test.openminted.eu/landingPage/corpus/9f4ebc21-aebe-4fb2-90c9-59bd189b9619
The corpus browser doesn't seem to work on that particular corpus.
Someone sent a 1 GB corpus to the executor ~2 hours ago.
It will take ages.
okay daft question then..... don't we allow parallel executions? I thought that was the point of the cloud backend?
Did you try to download it? https://test.openminted.eu/landingPage/corpus/9f4ebc21-aebe-4fb2-90c9-59bd189b9619
It is empty. So I assume no data were fed to the workflow. I think that in such cases workflow-service is not able to understand that processing has finished. Not sure.
@courado @antleb But why is it empty?
I have deployed a component and tried to run it on the platform. The result of the operation is listed as "FAILED", but I have no idea why. How can one get access to the log output?
Instance: test.openminted.eu