galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org
Other
1.42k stars 1.01k forks source link

Replace Text replaces intentionally left open "Replace with" with "[Object object]" #18545

Open Sch-Da opened 4 months ago

Sch-Da commented 4 months ago

Describe the bug When running the "Replace Text" Tool (toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_line/9.3+galaxy1) in a workflow, the intentionally left blank "replace with" part is unintentionally filled with "[object Object]". It runs smoothly outside of a workflow.

I am trying to use the "Replace Text" Tool to clean my text and delete some unnecessary passages. As I wanted to delete text, I did not fill the "Replace with" box. It works and gives a clean output when I run the tool alone. When it is part of a workflow, the "Replace with" that I intentionally left blank seems to be automatically filled with "[object Object]", resulting in a rather useless file.

Galaxy Version and/or server at which you observed the bug Galaxy Version: version_major | "24.1" version_minor | "2.dev0"

Browser and Operating System Operating System: Windows Browser: Firefox

To Reproduce Steps to reproduce the behavior:

  1. Go to WORKFLOW https://usegalaxy.eu/u/schnda/w/copy-of-comparing-differences-in-two-english-texts
  2. Run workflow with two text files - for example with https://openbible.com/textfiles/akjv.txt and https://openbible.com/textfiles/kjv.txt
  3. Run workflow in steps 5 + 6 of the workflow, pre-processing of text 1 and 2, I wanted to remove some text.

For "find pattern" I inserted: ^(.*?)\t

The Replace, I left open. But when checking again after running, it was replaced by [Object object]

Expected behavior Cleaning of text, removal of a part of the input, as shown here. This works when running alone - but somehow not in the workfow.

Screenshots this is the outcome if run independently screenshot 158

what it actually looks like in the workflow screenshot 157

Thanks for looking into it!

wm75 commented 4 months ago

Reproducing/debugging this is made more complicated by #18546 but I think this gets "fixed" by viewing the empty param on the WF run form. I'm able to reproduce the issue when just running the linked WF as is.

mvdbeek commented 4 months ago

While this is a bug, I would strongly encourage you to use a workflow parameter for this usecase. https://training.galaxyproject.org/training-material/topics/galaxy-interface/tutorials/workflow-parameters/tutorial.html#add-an-integer-workflow-parameter shows how to do this for integer parameters but it works the same way for text parameters.

Sch-Da commented 4 months ago

Thanks for pointing this out @mvdbeek. I checked the tutorial, but could you quickly explain the benefits of the workflow parameter here? That was not clear to me from the tutorial. Do you mean by using the parameter and setting it to text, I can likely avoid getting the error? And: Is this a best practice to use in general or rather a workaround limited to this case? Thank you!

Sch-Da commented 4 months ago

It worked in this workflow, where I got the cleaner text out. https://usegalaxy.eu/u/schnda/h/parameter-text

However, if I add the next steps of cleaning my text and want to remove the punctuation mark with the same tool, it again adds gibberish. If I am not mistaken, regex [^\w\s] should catch all the commas, dots, etc. I thought leaving the "remove" panel blank should just remove them. Instead, despite putting both inputs as a workflow parameter with text, the output replaces everything with various amounts of s and w.

screenshot 159

See history here: https://usegalaxy.eu/u/schnda/h/comparing---2-regex-text-parameters-for-removal-used

Wondering if it's me or a bug...

wm75 commented 4 months ago

Now that is not a bug (and certainly not WF-related) but just a consequence of the regex flavor used by the sed tool: escaped character class symbols like \w and \s simply don't work inside square brackets but are interpreted as the literal characters. In other words, you're discarding everything that is not a "w" or an "s". If you want to use character classes, take a look at, e.g., https://www.gnu.org/software/sed/manual/html_node/Character-Classes-and-Bracket-Expressions.html.

mvdbeek commented 4 months ago

Thanks for pointing this out @mvdbeek. I checked the tutorial, but could you quickly explain the benefits of the workflow parameter here?

Sure, the main one is that we use the modern workflow run form, which doesn't let users alter the workflow, potentially in ways that are not valid. There's many ways in which the old workflow run form is broken (beyond the bug you found). Think of workflows like a program you write, you wouldn't want users to change things in the source code, instead you want to clearly show and describe what the valid parameters are. This is what workflow parameters do. They're also recorded for posterity and you can see them under the inputs tab of the executed workflow. If users change a parameter right inside the workflow it's kind of complicated to find out if and what they changed.

Sch-Da commented 4 months ago

Thanks @wm75 and @mvdbeek for your helpful feedback! I will incorporate this from now on.

Sch-Da commented 1 week ago

Hi all, I was wondering if someone is still working on this issue. While the workaround works for when I use the expanded, full workflow form, it does not in the short version, where the bug persists.

Workflow: https://usegalaxy.eu/u/schnda/w/copy-of-copy-of-comparing-differences-in-two-english-texts

History: https://usegalaxy.eu/u/schnda/h/comparing-differences-in-two-texts-word-based

Here, I filled in the inputs without expanding the workflow and in step 21, I wanted to remove punctuation, leaving "replace with" blank, which is automatically filled with [object Object] runining the output. I would be happy for any suggestions as I had hoped to share this as a training in the beginning of next year for a conference.

mvdbeek commented 1 week ago

Can you narrow this down to a smaller example and share an invocation id with us ? I've created a smaller example workflow with just step 21 in https://usegalaxy.org/workflows/invocations/4c9a7b1e8376b2f0?from_panel=true and that works fine. Or are you saying that I need to modify step 21 ? I don't see any Replace with parameter there.

Sch-Da commented 1 week ago

We could be referring to two different steps: I meant step 21 in the history, which included REPLACE TEXT.

Here is a shorter version of the history: https://usegalaxy.eu/u/schnda/h/minimal-example-regex-error-v1 Here, the error occurs in step 5, replace text on data 3. This is the shorter workflow: https://usegalaxy.eu/u/schnda/w/minimal-example-comparing-differences-in-two-texts-word-based-1

If this is still too long, I will minimise it again tomorrow. Thanks for checking

mvdbeek commented 6 days ago

I think I understand the problem now. You've added an input connection to step 9 Find pattern, but haven't connected anything to it, when I think that to cast everything to lowercase you want the fixed value .* there ? Similarly, in step 7 you've made the replacement a runtime input when it should probably be an empty string so all matches are removed ?

Is https://usegalaxy.eu/u/m.vandenbeek/w/minimal-example-comparing-differences-in-two-texts-word-based-imported-from-uploaded-file what you want to do ?

Step 7

Screenshot 2024-11-19 at 10 02 54

Step 9

Screenshot 2024-11-19 at 10 03 03

Note that steps are workflow steps, items in the history are history items.

I do see multiple things that could be considered bugs:

Sch-Da commented 6 days ago

Yes, https://usegalaxy.eu/u/m.vandenbeek/w/minimal-example-comparing-differences-in-two-texts-word-based-imported-from-uploaded-file is what I want to do. The aim is to clean and unify text for later comparison. This part of the workflow aims to be open to the users' needs, depending on what text they want to compare later. (Those parts are deleted in this minimal example) If the text is already pre-processed, those steps are irrelevant and, therefore, not filled.

I managed to break down the workflow further and hope that helps narrow down the issue. https://usegalaxy.eu/u/schnda/w/copy-of-minimal-example-comparing-differences-in-two-texts-word-based

the error occurs here, when the replacement I want to make is, for example [[:punct:]], to remove punctuation: https://usegalaxy.eu/u/schnda/h/minimal-example-comparing-differencesv2

However, it works smoothly when I input [[:punct:]] in the fully expanded workflow form https://usegalaxy.eu/u/schnda/h/minimal-example-comparing-differencesv2-expanded

Do you see what I mean? When I click on the third item in the history, the difference is that the one with the bug auto-filled the replacement, which I wanted to leave blank with [object object]. Therefore, I suspect this is the cause of the error.

screenshot 191

mvdbeek commented 6 days ago

Sorry, those histories don't help, could you let me know which steps in the workflow I would need to look at ? Do you confirm that you've removed the runtime parameters ?

Sch-Da commented 6 days ago

Please take a look at Step 4 "clean text one", which is to replace text. Sorry, but I don't get what runtime parameters mean in this context. The best practice panel does not contain any warnings on runtime values in this version, if that helps

mvdbeek commented 6 days ago

See my screenshot of step 7, the caret controls whether something is a runtime parameter.

mvdbeek commented 6 days ago

This is what you want to do in step 4, "clean text one"

Screenshot 2024-11-19 at 11 33 54

If you want users to be able to change the value, you need to make it a connection and add an optional text input.