shubham1637 / DIAlignR

This is an R package for alignment of DIA mass-spec data

[FEATURE] Add support for IPF PTMs alignment #15

Closed singjc closed 3 years ago

singjc commented 3 years ago

Added runType="DIA_IPF" for alignment of IPF PTM proteomics results.

shubham1637 commented 3 years ago

When I run devtools::test() on this branch, a few tests fail: [ FAIL 5 | WARN 0 | SKIP 5 | PASS 560 ]

singjc commented 3 years ago

Okay, I fixed all the errors that came up when running the tests.

  1. Most of the errors came from other scripts that call getFeatures. I added the maxIPFFdrQuery param, and in most places where getFeatures is used the arguments are passed by position rather than by name, e.g. getFeatures(fileInfo, maxFdrQuery, params[["maxIPFFdrQuery"]], runType, lapply). I was wondering why we don't set the input parameter values using named arguments, i.e. getFeatures(fileInfo=fileInfo, maxFdrQuery=maxFdrQuery, maxIPFFdrQuery=params[["maxIPFFdrQuery"]], runType=runType, lapply=lapply)? Is it just more efficient/faster not to name them explicitly?
  2. The same errors occurred with getMultipeptide, because I added the runType param.
  3. I forgot to update the test for test_get_osw_query, because I changed where the DETECTING and IDENTIFYING transitions get selected in the query.
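
To illustrate point 1, here is a minimal sketch of how inserting a parameter in the middle of a signature silently shifts positional arguments, while named arguments stay correct. The functions get_features_old and get_features_new and the parameter applyFun are hypothetical stand-ins, not the real DIAlignR getFeatures code. As far as I know this is a robustness question rather than a speed one.

```r
# Hypothetical stand-ins for a function before and after a parameter is
# inserted in the middle of its signature (not the real DIAlignR code).
get_features_old <- function(fileInfo, maxFdrQuery, runType, applyFun) {
  list(maxFdrQuery = maxFdrQuery, runType = runType)
}

get_features_new <- function(fileInfo, maxFdrQuery, maxIPFFdrQuery,
                             runType, applyFun) {
  list(maxFdrQuery = maxFdrQuery,
       maxIPFFdrQuery = maxIPFFdrQuery,
       runType = runType)
}

fileInfo <- data.frame(runName = "run0")

# A positional call written for the old signature still runs against the new
# one, but every argument after maxFdrQuery lands one slot too early:
# the run type string is matched to maxIPFFdrQuery and lapply to runType.
positional <- get_features_new(fileInfo, 0.05, "DIA_Proteomics", lapply)

# The named call is unaffected by where the new parameter was inserted.
named <- get_features_new(fileInfo = fileInfo, maxFdrQuery = 0.05,
                          maxIPFFdrQuery = 0.05, runType = "DIA_IPF",
                          applyFun = lapply)

str(positional$maxIPFFdrQuery)  # a character string, not an FDR cutoff
str(named$runType)
```

Because R evaluates arguments lazily, the shifted positional call does not necessarily fail at the call site; the wrong value only surfaces once it is used, which is consistent with the errors showing up in unrelated tests.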

All tests seem to pass on my local copy now; let me know if any tests are still failing on your end.

══ Results ═══════════════════════════════════════════════════════════════════════════════════════════
Duration: 22.1 s

── Skipped tests  ────────────────────────────────────────────────────────────────────────────────────
• empty test (5)
• ropenms not available for testing. A conda environment with name TricEnvr is MUST for testing. (3)

[ FAIL 0 | WARN 0 | SKIP 8 | PASS 605 ]

shubham1637 commented 3 years ago

Thanks. It is working on my end as well. I will also test on the Streptococcus data to check that the output matches the previous version, and will merge afterwards.

shubham1637 commented 3 years ago

There is an error during the Bioconductor check due to the large size of the example data files. One option is to reduce the file size by keeping only 10-20 peptides. Alternatively, you can upload the data as a separate data package on Bioconductor and load it when you run your tests.

R CMD BiocCheck DIAlignR
* Checking package size...
    * ERROR: Package Source tarball exceeds Bioconductor size
      requirement.
        Package Size: 32.083 MB
        Size Requirement: 5.0000 MB
* Checking individual file sizes...
    * WARNING: The following files are over 5MB in size:
      'inst/ptms/xics/chludwig_K150309_004b_SW_1_16.chrom.sqMass
      inst/ptms/xics/chludwig_K150309_008_SW_1_4.chrom.sqMass
      inst/ptms/xics/chludwig_K150309_013_SW_0.chrom.sqMass'

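
A quick local check for oversized files can be sketched in R before running BiocCheck. The helper large_files below is hypothetical and not part of DIAlignR or BiocCheck. It assumes the 5 MB limit from the output above and counts a megabyte as 1e6 bytes, which may differ slightly from how BiocCheck measures size.

```r
# Hypothetical helper: list files under `path` larger than `limit_mb`.
# The 5 MB default mirrors the limit quoted in the BiocCheck output;
# treating 1 MB as 1e6 bytes is an assumption.
large_files <- function(path, limit_mb = 5) {
  files <- list.files(path, recursive = TRUE, full.names = TRUE)
  sizes_mb <- file.size(files) / 1e6
  keep <- sizes_mb > limit_mb
  data.frame(file = files[keep],
             size_mb = round(sizes_mb[keep], 2),
             stringsAsFactors = FALSE)
}

# Self-contained demo on a temporary directory holding one small file
# and one file just over the limit.
demo_dir <- file.path(tempdir(), "size_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeBin(raw(1e3), file.path(demo_dir, "small.bin"))
writeBin(raw(6e6), file.path(demo_dir, "large.bin"))

oversize <- large_files(demo_dir)
print(oversize)
```

Calling large_files("inst") from the package root would list the example files that still need to be shrunk or moved to a data package.
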
singjc commented 3 years ago

@shubham1637 I shrank the test files.

shubham1637 commented 3 years ago

Hi, I have been thinking about this since last week but only got a chance to try it out now.

(base) shubham@shubham-desktop:~/temp$ git clone git@github.com:shubham1637/DIAlignR.git
Cloning into 'DIAlignR'...
remote: Enumerating objects: 7226, done.
remote: Counting objects: 100% (994/994), done.
remote: Compressing objects: 100% (540/540), done.
remote: Total 7226 (delta 703), reused 679 (delta 452), pack-reused 6232
Receiving objects: 100% (7226/7226), 28.31 MiB | 3.13 MiB/s, done.
Resolving deltas: 100% (5357/5357), done.
Checking connectivity... done.

PR:

(base) shubham@shubham-desktop:~$ git clone git@github.com:singjc/DIAlignR.git
Cloning into 'DIAlignR'...
remote: Enumerating objects: 7690, done.
remote: Counting objects: 100% (1085/1085), done.
remote: Compressing objects: 100% (597/597), done.
remote: Total 7690 (delta 772), reused 740 (delta 484), pack-reused 6605
Receiving objects: 100% (7690/7690), 55.04 MiB | 3.72 MiB/s, done.
Resolving deltas: 100% (5726/5726), done.
Checking connectivity... done.

As you can see, the original git repo is 28.31 MiB, while the PR repo is 55.04 MiB.

(base) shubham@shubham-desktop:~/DIAlignR$ ll -h inst/ptms/xics/
total 736K
drwxrwxr-x 2 shubham shubham 4.0K Jul  5 18:05 ./
drwxrwxr-x 4 shubham shubham 4.0K Jul  5 18:05 ../
-rw-rw-r-- 1 shubham shubham 236K Jul  5 18:05 chludwig_K150309_004b_SW_1_16.chrom.sqMass
-rw-rw-r-- 1 shubham shubham 236K Jul  5 18:05 chludwig_K150309_008_SW_1_4.chrom.sqMass
-rw-rw-r-- 1 shubham shubham 256K Jul  5 18:05 chludwig_K150309_013_SW_0.chrom.sqMass

(base) shubham@shubham-desktop:~/DIAlignR$ git checkout 1db6d3dd4044b45549af542165ea16a75868076e
Note: checking out '1db6d3dd4044b45549af542165ea16a75868076e'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 1db6d3d... [ADD] Add runType definition to getMultiPeptide

(base) shubham@shubham-desktop:~/DIAlignR$ ll -h inst/ptms/xics/
total 33M
drwxrwxr-x 2 shubham shubham 4.0K Jul  5 18:07 ./
drwxrwxr-x 4 shubham shubham 4.0K Jul  5 18:05 ../
-rw-rw-r-- 1 shubham shubham  11M Jul  5 18:07 chludwig_K150309_004b_SW_1_16.chrom.sqMass
-rw-rw-r-- 1 shubham shubham  11M Jul  5 18:07 chludwig_K150309_008_SW_1_4.chrom.sqMass
-rw-rw-r-- 1 shubham shubham  11M Jul  5 18:07 chludwig_K150309_013_SW_0.chrom.sqMass