OpenMS / OpenMS

The codebase of the OpenMS project
https://www.openms.de

test workflows added by eugen to KNIME hub #6678

Closed timosachsenberg closed 1 year ago

timosachsenberg commented 1 year ago

Assigned to @Ayesha-Feroz. Please coordinate with @enetz . Thanks!

jpfeuffer commented 1 year ago

Preferably on the command line, so that it can be automated in the future.

timosachsenberg commented 1 year ago

@Ayesha-Feroz is new to KNIME so she would probably also need some guidance regarding what to do if it needs to be run from the command line

jpfeuffer commented 1 year ago

https://www.knime.com/faq#q12
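
For orientation, a headless run of the kind the FAQ describes can be sketched as follows. This is a minimal sketch, assuming the KNIME batch application (org.knime.product.KNIME_BATCH_APPLICATION) and its -workflowDir, -reset and -nosave options; the executable name and the workflow path are placeholders, and the helper name build_knime_batch_command is made up for illustration.

```python
import shlex


def build_knime_batch_command(knime_executable, workflow_dir, reset=True):
    """Build the argument list for a headless KNIME workflow run.

    Assumption: the batch application and its options behave as described
    in the KNIME FAQ; both paths passed in are placeholders.
    """
    cmd = [
        knime_executable,
        "-nosplash",      # do not show the splash screen
        "-consoleLog",    # send log output to the console
        "-application", "org.knime.product.KNIME_BATCH_APPLICATION",
        f"-workflowDir={workflow_dir}",
        "-nosave",        # do not write the executed workflow back to disk
    ]
    if reset:
        cmd.append("-reset")  # reset all nodes before executing
    return cmd


cmd = build_knime_batch_command(
    "knime", "/path/to/workspace/basic_peptide_identification"
)
print(shlex.join(cmd))
```

The resulting list could be handed to subprocess.run(cmd, check=True) once the paths point to a real installation and workflow.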

enetz commented 1 year ago

The workflows are now in the 'OpenMS Tutorial Workflows' public KNIME hub space: https://hub.knime.com/openms-team/spaces/OpenMS%20Tutorial%20Workflows/latest/~A07Zg2qhcHp7-ecv/

I added links to the abibuilder archive and descriptions on where the example input data should be put.

cbielow commented 1 year ago

I pretended to know nothing about KNIME and attempted a fresh install of the current KNIME 4.7.1, followed by importing any of the workflows in https://github.com/OpenMS/Tutorials/tree/master/Workflows.

image followed by image

Then I tried the OpenMS website to find installation instructions for KNIME. That also went wrong, since the installation instructions at https://www.openms.de/install/#installing-openms-in-knime have several problems: 1) they are out of date (the menu items in the current KNIME 4.7.1 look different); 2) the bespoke KNIME File Handling Nodes package does not exist; 3) the installation of the OpenMS plugin is truncated, and the user needs to figure out what to add from the screenshot by manually typing https://update.knime.com/community-contributions/trusted/4.3 (this KNIME version is also pretty old). Trying 4.7 does not find OpenMS (ouch). Then I tried https://update.knime.com/community-contributions/trusted/4.6 just for fun, and it seems to work...

However, as a normal user I would have given up way earlier. With this state of the docs (installation instructions on the OpenMS website and where to find KNIME workflows, see https://github.com/OpenMS/Tutorials/issues/205), this is unreleasable.

jpfeuffer commented 1 year ago

We were kicked out of trusted because of Sirius. See #6244 and related issues.

jpfeuffer commented 1 year ago

When this is fixed everything works starting from your second picture.

jpfeuffer commented 1 year ago

Download from https://www.knime.com/downloads works for me. The only checkbox I need to click is to Accept the terms.

  1. is correct. Since 4.3 this is apparently no longer needed. I fixed the docs.
  2. Yes, the website docs need to be fixed. But the menu items actually look the same for me.
Ayesha-Feroz commented 1 year ago

I agree with @cbielow about mentioning it on the documentation page on Discord. The links circled on the documentation page seem to be broken.

Screenshot 2023-02-16 at 11 38 35
jpfeuffer commented 1 year ago

Thanks, but this is a separate problem: https://github.com/OpenMS/openms.de/issues/58

Ayesha-Feroz commented 1 year ago

The plugin and dependency installers are outdated and cannot be found as described in the documentation: the documentation refers to the 4.6 platform, but the current one is 4.7. In addition, some plugins are missing from the KNIME & Extensions category. A screenshot is added for reference. Screenshot 2023-02-16 at 10 42 40

jpfeuffer commented 1 year ago

I recommend KNIME 4.6 for now. That's also why documentation is not updated yet. Dependency installer links will also be shown after installation at KNIME startup. (https://abibuilder.cs.uni-tuebingen.de/archive/openms/OpenMSInstaller/PrerequisitesInstaller/)

jpfeuffer commented 1 year ago

I made some small fixes to the docs and removed unnecessary links, but the main problem persists: OpenMS is not available on the 4.7 KNIME update site until I split the plugin.

By the way, the prerequisites should not be needed anymore for the next release, so I removed them. Even if they were, they are only needed for MSConvert file conversion.

Ayesha-Feroz commented 1 year ago

Hi @jpfeuffer, I have followed these steps:

1) From the ‘Work with:’ drop-down list, select the update site ‘KNIME Community Extensions (Trusted) - https://update.knime.com/community-contributions/trusted/4.6’.

2) Now select the following plugin from the “KNIME Community Contributions - Bioinformatics & NGS” category:

OpenMS

3) Click on Next and follow the instructions. After a restart of KNIME, the OpenMS nodes will be available in the Node repository under “Community Nodes”.

Then I encountered the problem shown in the screenshots below. Can you guide me on how to solve this issue?

Screenshot 2023-02-17 at 14 46 48 (https://user-images.githubusercontent.com/94114994/219676093-04ff7bc3-9526-4cc8-841a-6d70ba9147db.png)

Screenshot 2023-02-17 at 14 46 28

jpfeuffer commented 1 year ago

How did you install KNIME? It looks like it is inside an AppTranslocation limbo. Did you actually drag it to some place on your computer?

Ayesha-Feroz commented 1 year ago

1) I downloaded it from https://www.knime.com/downloads. 2) And then:

Screenshot 2023-02-17 at 15 14 43
jpfeuffer commented 1 year ago

And then? How did you open it?

Ayesha-Feroz commented 1 year ago
Screenshot 2023-02-17 at 15 18 03
jpfeuffer commented 1 year ago

What did you click? You cannot click the KNIME logo from your screenshot. You must navigate to Applications and open it from there. Is that what you did?

Ayesha-Feroz commented 1 year ago

@jpfeuffer Yes, you are right. The problem appears when I launch KNIME and use the already opened window; when I close it and open it again from Applications, it works fine. Thanks for your help.
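
For anyone hitting the same error on macOS: an app that is started without first being moved to a permanent location may be run from a temporary, read-only path that contains AppTranslocation. A minimal sketch for spotting this, assuming you know the path the running application reports (the helper name is_translocated and both example paths are made up for illustration):

```python
from pathlib import PurePosixPath


def is_translocated(app_path):
    """Return True if the path runs through a macOS AppTranslocation folder."""
    return "AppTranslocation" in PurePosixPath(app_path).parts


# Hypothetical example paths, for illustration only.
print(is_translocated("/private/var/folders/xx/T/AppTranslocation/1234/d/KNIME 4.6.app"))
print(is_translocated("/Applications/KNIME 4.6.app"))
```

If the check is positive, closing KNIME and reopening it from Applications, as described above, is the fix that worked here.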

timosachsenberg commented 1 year ago

what is the current state? done? move to backlog?

timosachsenberg commented 1 year ago

what is the current state?

timosachsenberg commented 1 year ago

please comment on current state

timosachsenberg commented 1 year ago

Please update current state

Ayesha-Feroz commented 1 year ago

@timosachsenberg I want to discuss a few things with Eugen. He is at a conference right now, so I will update you here afterwards.

Ayesha-Feroz commented 1 year ago

So far I have tested the Metabolite Spectral ID workflow using the example input data. Furthermore, I have tested a basic peptide identification workflow on label-free example data, as well as basic peptide identification with inference. These workflows were loaded into my KNIME GUI, and the corresponding example data was loaded and tested within the GUI environment.

timosachsenberg commented 1 year ago

How to install nightly plugins:

  1. Add our build archive as plugin source (image)

  2. Install nodes (image)

I will test the workflows from https://github.com/OpenMS/Tutorials on my Windows machine.

jpfeuffer commented 1 year ago

If not on command line: I would have suggested checking the box of the folder, clicking download and just extracting the tar into the KNIME Workspace. Less load on both sides.

timosachsenberg commented 1 year ago

I think the tar plugin is broken ... at least last time it had some issues

jpfeuffer commented 1 year ago

FTP would be much nicer anyway

enetz commented 1 year ago

Working on the KNIME workflows, I found a few more issues:

  1. One more place where the path on Windows seems to be broken: in the workflow Identification_quantification_with_inference_isobaric_epifany_MSstatsTMT there is an R View node using MSstats at the end, which gets its input file path from a URI port to Variable node. The raw file path it creates looks like this: file:/C:/Users/eugen/AppData/Local/Temp/knime_Identification_19483/fs-MSsta_3-240-96751/000/000/MSstatsConverter_0/PAMI-176_Mouse_A-J_TMT_40ug_14pctACN_25cm_120min_20160223_OT.csv The R script removes the file: string at the beginning, which is necessary on macOS and Linux as well, but the first / is still there, so the script fails to load the file. We might be able to code a Windows-specific workaround in the R script, but that first / shouldn't be there in the first place. Not sure if we can do anything about URI port to Variable, though.

  2. Timo suggested using the mountpoint as the origin for the example data paths, instead of the workflows themselves. Using File Importer with Read from Mountpoint LOCAL shows some weird glitches for me on Windows. Selecting Mountpoint defaults to LOCAL in the second box and the node seems to work: it shows file counts for the given path and seems to select the correct files. But they never arrive at the next node, which receives empty lists or something similar. E.g. the ZipLoopStart for multiple mzML files says: Execute failed: Index 0 out of bounds for length 0, and XTandemAdapter for the fasta database says: Configure failed (IndexOutOfBoundsException): Index 0 out of bounds for length 0. Changing the value of LOCAL to one of the other greyed-out options says The selected mountpoint <option> is not connected., but after changing it once, it only lets me switch between the 3 greyed-out options and not back to LOCAL. Using Read from Relative to Current mountpoint works, with the intended effect I believe. So I am changing all the workflow inputs and outputs to this format for now.

  3. SIRIUS always returns this error message (with SiriusAdapter and AssayGeneratorMetabo): Failing process stderr: [Error: Unexpected internal error (Sirius was executed, but an empty output was generated)]. It seems SIRIUS now requires a username and password? This might be the problem. SiriusAdapter has parameters for that. Do we have a default OpenMS account there, or is that impossible? For AssayGeneratorMetabo this might be more complicated, since it doesn't have such parameters, so it might not have been updated yet.

  4. In File Exporter, the Create missing folders checkbox doesn't seem to do anything. If every folder in the output path is not already there, this error is returned:

    ERROR File Exporter        7:80       java.nio.file.NoSuchFileException: C:\DATA\KNIME_Test\Workspace\Example_Data\TEST_RESULTS\basic_peptide_identification\lfq_spikein_dilution_1.idXML
    ERROR File Exporter        7:80       Execute failed: C:\DATA\KNIME_Test\Workspace\Example_Data\TEST_RESULTS\basic_peptide_identification\lfq_spikein_dilution_1.idXML

    In this example C:\DATA\KNIME_Test\Workspace\Example_Data\TEST_RESULTS\ exists, just the workflow specific folder basic_peptide_identification in it was missing. After creating that folder manually, it works normally.
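
The behaviour I would expect from Create missing folders in item 4 can be illustrated outside KNIME. This is a minimal Python sketch of the idea, not the File Exporter implementation; the function name export_file and the file content are made up for illustration.

```python
import tempfile
from pathlib import Path


def export_file(target, content):
    """Write content to target, creating any missing parent folders first."""
    target = Path(target)
    # parents=True creates every missing folder on the way,
    # exist_ok=True makes the call harmless if they already exist.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return target


with tempfile.TemporaryDirectory() as workspace:
    out = export_file(
        Path(workspace)
        / "TEST_RESULTS"
        / "basic_peptide_identification"
        / "lfq_spikein_dilution_1.idXML",
        "placeholder content",
    )
    print(out.exists())
```

Without the mkdir call, the write would fail with a missing-path error comparable to the NoSuchFileException above.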

jpfeuffer commented 1 year ago
  1. Always was the case. Needs if-else for windows
  2. Do you have logs for what happens with LOCAL? What are the other mountpoints? Unless you have a server or hub connected it makes sense that they are greyed out. The switching back and forth is probably a KNIME bug.
  3. I think we should remove SIRIUS altogether and find alternatives. It has just become infeasible to maintain adapters. For now, the only thing to advise is to expose user and password as String input variables to the user.
  4. I will maybe check, but I honestly don't want to work on these nodes much longer.
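
For item 1, the Windows branch could look roughly like the following. It is sketched in Python rather than in the R of the actual node, the function name file_uri_to_path is made up, and percent-encoded characters in the URI are deliberately not handled.

```python
import re


def file_uri_to_path(uri):
    """Turn a 'file:' URI string into a local path string.

    On macOS/Linux only the 'file:' prefix is stripped. For Windows-style
    URIs the slash in front of the drive letter is dropped as well.
    """
    path = uri[len("file:"):] if uri.startswith("file:") else uri
    # Windows case: '/C:/Users/...' must become 'C:/Users/...'
    if re.match(r"^/[A-Za-z]:/", path):
        path = path[1:]
    return path


print(file_uri_to_path("file:/C:/Users/eugen/AppData/Local/Temp/example.csv"))
print(file_uri_to_path("file:/tmp/knime/example.csv"))
```

The first call prints a path starting with C:/, the second keeps its leading slash, so one function covers all three platforms.
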
jpfeuffer commented 1 year ago

Thanks for testing. Good finds.

jpfeuffer commented 1 year ago

@enetz I could fix 4 easily. But I cannot reproduce 2. I think I would need more info for this, to see if it is a windows-only problem.

jpfeuffer commented 1 year ago

I also avoid copying in the case of LOCAL now. It always worked on my Mac both with LOCAL (even with copying) and with "Relative to current mountpoint". But maybe this helps with your case, too.

enetz commented 1 year ago

The nightlies haven't caught up with your PR yet, so I can't test the changes now.

So a bit more about the previous state (KNIME 4.7.2, Windows 10). About 2: this is the console output:

ERROR XTandemAdapter       3:78       Configure failed (IndexOutOfBoundsException): Index 0 out of bounds for length 0
ERROR ZipLoopStart         3:2        Execute failed: Index 0 out of bounds for length 0
ERROR XTandemAdapter       3:78       Configure failed (IndexOutOfBoundsException): Index 0 out of bounds for length 0

The File Importer node is green, the ZipLoopStart after it has a red cross, and so does the XTandemAdapter after the loop node.

If I open the Loaded Files of the File Importer: usually there should be Model Content with child-0, child-1 and child-2 nodes with folder icons on their left and file paths inside. With Mountpoint LOCAL I only see Model Content with a grey dot on its left. The ZipLoop's tables are also empty.

Those other greyed-out mountpoints that cause the config window to glitch are knime-temp-space, My-KNIME-Hub and EXAMPLES. In the image you can also see that while LOCAL is selected, it seems to recognize the files and selects 3 of 5 as expected.

Mount_LOCAL1

jpfeuffer commented 1 year ago

Can you check if you can update GKN? I think only GKN was affected

enetz commented 1 year ago

All nodes in nightly are from Thursday last week.

jpfeuffer commented 1 year ago

Hmm.. nightly trigger broken. sirius tests broken. I guess that needs to be fixed first.

timosachsenberg commented 1 year ago

Any idea what is wrong?

jpfeuffer commented 1 year ago

No, I did not check. The lack of triggers despite commits in the last days points to faulty deliveries to the Jenkins API endpoint. You can try manual triggers to rule out general problems. And well, Sirius: same problems as always. Accounts not set up correctly, server down, a new server-side version that messes up ordering, ... did I forget something?

timosachsenberg commented 1 year ago

@axelwalter can you take a look at Sirius in jenkins?

enetz commented 1 year ago

More about point 1 in my list above, regarding the R View MSstats node in Identification_quantification_with_inference_isobaric_epifany_MSstatsTMT.

After fixing the input path in that R node, it processes some stuff for a while and has a long output text in the R console. Then it shows this error before finishing: Error: Assertion on 'colnames of contrast matrix' failed: Must be a permutation of set {'Long_LF','Long_M','Short_LF','Short_HF','Long_HF'}, but has extra elements {'0.125','0.5','0.667','1'}.

There is also this line in the code:

# Set the column names
colnames(comparison)<- c("0.125", "0.5", "0.667", "1")

After some additional testing, it is clear that this R View node is written for one specific TMT dataset, which is 35 GB in size. It doesn't work with the other (< 2 GB) TMT dataset we have. There are some NULL outputs from print lines before the error messages, indicating that some data structures that should be populated are empty. So I think it is not just those column names that don't match. I am not familiar with MSstats yet, so this might take a while for me to figure out if we don't want to stick to the 35 GB dataset. I am unable to run that one through to the end on my laptop; it takes way too long for an "example".
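
The assertion above complains that the column names of the contrast matrix are not a permutation of the condition names found in the data. The comparison can be sketched like this; it is a Python illustration of the check, not MSstats code, the function name check_contrast_columns is made up, and duplicate names are ignored for simplicity.

```python
def check_contrast_columns(contrast_columns, conditions):
    """Compare contrast matrix column names with the conditions in the data.

    Returns (missing, extra). Both lists are empty when the two name sets
    match, which is what the assertion requires.
    """
    cols, conds = set(contrast_columns), set(conditions)
    missing = sorted(conds - cols)  # conditions without a column
    extra = sorted(cols - conds)    # columns that are not conditions
    return missing, extra


# Names taken from the error message and the hard-coded colnames line above.
conditions = ["Long_LF", "Long_M", "Short_LF", "Short_HF", "Long_HF"]
hardcoded = ["0.125", "0.5", "0.667", "1"]

missing, extra = check_contrast_columns(hardcoded, conditions)
print("missing:", missing)
print("extra:", extra)
```

Deriving the column names from the conditions actually present in the input, instead of hard-coding them, would make the node independent of one particular dataset.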

enetz commented 1 year ago

Otherwise, issues 2 and 4 are fixed in today's nightlies.

File Exporter: Creating folders works now

File Importer: Mountpoint LOCAL works just like Relative to Current Mountpoint. Actually exactly the same way, so I am not sure what the difference is. Should I update the workflows to Mountpoint LOCAL, or leave them with Relative to Current Mountpoint as is?

jpfeuffer commented 1 year ago

You will only notice a difference when executing the workflow on a server/hub, from your local Analytics platform. If I understand it correctly it will then fetch/upload the files from LOCAL.

This might be a benefit of LOCAL; however, I don't know if LOCAL is a standardized name or if people can have different names (in which case it would fail). Maybe just keep "Current mountpoint" for now.

enetz commented 1 year ago

Update on the R code:

I think I fixed it, mostly. The final step, the MSstats groupComparisonPlots that draws Volcano Plots returns an error in the R View node, but after saving and loading the data in RStudio, it works as expected with literally the same data and the same line of code for the plot. This is something that must be happening inside the MSstats function groupComparisonPlots, but I can't tell why it works in RStudio and not in KNIME. Same versions of R and MSstats and other packages.

The error message is Error: attempt to apply non-function which means something that is not a function is being used as one, e.g. if you forget a * and write a(b+c) instead of a*(b+c) in an equation. This might be a weird issue with my specific KNIME setup, so I would appreciate it if someone else could run it on their end. It is the current version of Identification_quantification_with_inference_isobaric_epifany_MSstatsTMT in the KNIME Hub (https://hub.knime.com/openms-team/spaces/Tutorial%20Workflows%20OpenMS%203.0/latest/Identification_quantification_with_inference_isobaric_epifany_MSstatsTMT~w-9tSOJoo8kNCUxt) and expects the IsobaricTMT data in the new Example_Data archive (https://abibuilder.cs.uni-tuebingen.de/archive/openms/Tutorials/Example_Data/).