Noble-Lab / lupine

Mass spectrometry proteomics imputation with a multilayer perceptron
MIT License

Negative imputed values #1

Open RTMorris1 opened 1 week ago

RTMorris1 commented 1 week ago

I am having an issue running Lupine. I installed the software and downloaded the joint quantification data from GitHub without any difficulty. However, when I ran the tool on the joint quantification data table, I was surprised to see that over 97% of the missing values were imputed with negative values. When I compared my results to the published results available on zenodo.org, I didn't see any negative values (focusing on the BRCA results). I ran Lupine with the following command:

lupine impute --outpath lupine_test1c_out/ --device cuda joint_quantifications.csv
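For anyone wanting to reproduce this kind of check, the fraction of negative imputed values can be computed along the following lines. This is only a minimal sketch, not part of Lupine: the function name `negative_imputed_fraction` is hypothetical, and it assumes the input and output tables have already been loaded as equally shaped row-major matrices in which missing values in the input appear as NaN (or None).

```python
import math


def negative_imputed_fraction(original, imputed):
    """Fraction of originally missing entries that were imputed as negative.

    Assumption: `original` and `imputed` are equally shaped lists of rows,
    and a missing value in `original` is encoded as NaN or None.
    """
    n_missing = 0
    n_negative = 0
    for orig_row, imp_row in zip(original, imputed):
        for orig_val, imp_val in zip(orig_row, imp_row):
            if orig_val is None or math.isnan(orig_val):
                n_missing += 1
                if imp_val < 0:
                    n_negative += 1
    # Avoid dividing by zero when nothing was missing.
    return n_negative / n_missing if n_missing else 0.0


# Toy data (made up for illustration): 3 missing entries, 2 imputed negative.
nan = float("nan")
original = [[1.2, nan, 3.4], [nan, 5.0, nan]]
imputed = [[1.2, -0.7, 3.4], [2.1, 5.0, -1.3]]
fraction = negative_imputed_fraction(original, imputed)
print(f"{fraction:.1%} of imputed values are negative")
```

Only entries that were missing in the input are counted, so negative values that were already present in the original table do not inflate the figure.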

lincoln-harris commented 4 days ago

Hi @RTMorris1, thanks for your interest in the tool. A couple of questions that might help get to the bottom of this:

  1. What operating system are you on?
  2. What did the program output? Were there any error messages? Attaching a screenshot of the command line and the tool's output might be helpful here.
  3. What is your ultimate goal? Are you just hoping to use CPTAC protein quants that have been imputed with Lupine? Or do you have additional MS runs that you're hoping to impute?

I can also take a look at the files that were uploaded to Zenodo as well as the joint_quantifications.csv that is part of the package release. We're still in pre-beta phase so it's possible that there is a bug hiding somewhere.

Lincoln

RTMorris1 commented 3 days ago

Hi Lincoln,

  1. The OS is Ubuntu 20.04.
  2. The command used was: lupine impute --outpath lupine_test1c_out/ --device cuda joint_quantifications.csv (screenshot attached: Lupine_TestRun)
  3. I want to use the tool to impute missing quantified protein values for >1000 in-house TMT MS3 tissue experiments. I don't want to combine them with the CPTAC experimental data.

I was running Lupine on the joint quantification file in order to test the tool.

lincoln-harris commented 2 days ago

This should be resolved by commit ea6735ab62f8362e16629b9f056b558ebefad35e. This was hard to track down, but I think the learning rate for the Adam optimizer was set too high by default.
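As a general illustration of why an overly high Adam learning rate can cause trouble, here is a toy sketch. It is not Lupine's training code and does not reproduce the actual bug: it just runs a hand-written Adam update on the one-dimensional function f(x) = (x - 3)^2, with the function name `adam_trajectory`, the starting point, and both learning rates chosen purely for illustration.

```python
import math


def adam_trajectory(lr, steps=100, x0=5.0, beta1=0.9, beta2=0.999, eps=1e-8):
    """Run Adam on f(x) = (x - 3)^2 and return every iterate, starting at x0."""
    x, m, v = x0, 0.0, 0.0
    xs = [x]
    for t in range(1, steps + 1):
        g = 2.0 * (x - 3.0)                  # gradient of (x - 3)^2
        m = beta1 * m + (1 - beta1) * g      # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1 - beta1 ** t)         # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
        xs.append(x)
    return xs


low = adam_trajectory(lr=0.01)   # small steps: approaches the minimum from above
high = adam_trajectory(lr=1.0)   # large steps: overshoots past the minimum at x = 3
print("lowest iterate, lr=0.01:", min(low))
print("lowest iterate, lr=1.0: ", min(high))
```

With the small learning rate the iterates stay on the starting side of the minimum over these 100 steps, while with the large one they overshoot it within a few updates. In a real network the same kind of overshoot can push outputs into ranges the training data never contained.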

For your use case, since you have >1000 TMT experiments, you should be able to skip the convert and join steps and run the impute step directly on a matrix of protein (or peptide) quants.

Let me know if this patch works for you and I'll close this issue.