tartare - biocViews: ExperimentData, MassSpectrometryData #1286

Closed: cpanse closed this issue 5 years ago

cpanse commented 5 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For help with submitting your package, please subscribe and post questions to the bioc-devel mailing list.

bioc-issue-bot commented 5 years ago

Hi @cpanse

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: tartare
Type: Package
Title: Raw ground spectra recorded on Thermo Fisher Scientific mass
    spectrometers
Version: 0.0.1
Authors@R: c(person(given = "Christian", family = "Panse",
    email = "cp@fgcz.ethz.ch", role = c("aut", "cre"),
    comment = c(ORCID = "0000-0003-1975-3064")),
    person(given = "Tobias", family = "Kockmann",
    email = "Tobias.Kockmann@fgcz.ethz.ch", role = "aut", 
    comment = c(ORCID = "0000-0002-1847-885X")))
Depends: R (>= 3.6),
    AnnotationHub (>= 2.16),
    ExperimentHub (>= 1.0)
Imports:
    utils
Suggests:
    knitr,
    testthat
Description: provides raw files (size=215MBytes) recorded on different Liquid
  Chromatography Mass Spectrometry (LC-MS) instruments. All included MS
  instruments are manufactured by Thermo Fisher Scientific and belong to the
  Orbitrap Tribrid or Q Exactive Orbitrap family of instruments. Despite their
  common origin and shared hardware components (e.g. Orbitrap mass analyser),
  the above instruments tend to write data in different "dialects" in a shared
  binary file format (.raw). The intention behind tartare is to provide complex
  but slim real-world files that can be used to make code robust with respect
  to this diversity. In other words, it is intended for enhanced unit testing.
  The package is considered to be used with the
  rawDiag package (Trachsel, 2018 <doi:10.1021/acs.jproteome.8b00173>) and the
  Spectra MsBackends.
URL: https://github.com/cpanse/tartare
BugReports: https://github.com/cpanse/tartare/issues
Encoding: UTF-8
NeedsCompilation: no
biocViews: ExperimentData, MassSpectrometryData
RoxygenNote: 6.1.1
License: GPL-3
VignetteBuilder: knitr
Collate: 
    'zzz.R'

bioc-issue-bot commented 5 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read the instructions for setting up a push hook on your repository, or further changes to your repository will NOT trigger a new build.

bioc-issue-bot commented 5 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

8e60133 Update DESCRIPTION Adding a Web Hook

bioc-issue-bot commented 5 years ago

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

374f0f5 Update DESCRIPTION * ERROR: Maintainer must regis...

bioc-issue-bot commented 5 years ago

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

48f94fe Update DESCRIPTION managed to subscribe to https:...

bioc-issue-bot commented 5 years ago

Congratulations! The package built without errors or warnings on all platforms.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

fc7c9f3 fix md5
7cdaa85 Merge branch 'master' of github.com:cpanse/tartare
67adcc8 version inc

bioc-issue-bot commented 5 years ago

Congratulations! The package built without errors or warnings on all platforms.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

de00e18 vignette cosmetics

bioc-issue-bot commented 5 years ago

Congratulations! The package built without errors or warnings on all platforms.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

918a7e2 Update DESCRIPTION

bioc-issue-bot commented 5 years ago

Congratulations! The package built without errors or warnings on all platforms.

cpanse commented 5 years ago

@lshep, I will continue once all metadata are available through ExperimentHub(). Am I on the right track?

R> library(ExperimentHub)
R> eh <- ExperimentHub(); 
snapshotDate(): 2019-09-25
R> query(eh, "tartare")
ExperimentHub with 0 records
# snapshotDate(): 2019-09-25

lshep commented 5 years ago

Yes, I will be working on uploading the data later today. Sorry for the delay.

lshep commented 5 years ago

Hey, a few issues before the data can be added. Could you please make the following changes:

metadata.csv

package description:

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

7744cae metadata.csv [x] Please have some distinguishing ...

bioc-issue-bot commented 5 years ago

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

591aa30 adapted table size and version inc

bioc-issue-bot commented 5 years ago

Congratulations! The package built without errors or warnings on all platforms.

cpanse commented 5 years ago

https://github.com/Bioconductor/Contributions/issues/1286#issuecomment-543235759

@lshep done by commit 7744cae.

lshep commented 5 years ago

When I run the following, I get an ERROR (this function is used to add the data to the database):

> meta = makeExperimentHubMetadata("tartare/")
missing or NA values for 'Coordinate_1_based set to TRUE'
Loading valid species information.
Error in .checkThatSourceTypeSoundsReasonable(object@SourceType) : 
  'SourceType' should be one of: BAI, BAM, BED, BigWig, BioPax, BioPaxLevel2, BioPaxLevel3, CEL, Chain, CSV, ensembl, FASTA, FASTQ, FCS, GFF, GRASP, GTF, HDF5, IDAT, Inparanoid, JSON, MTX, MySQL, mzid, mzML, mzTab, mzXML, NCBI/blast2GO, NCBI/ensembl, NCBI/UniProt, RDA, RData, Simulated, tab, tar.gz, TSV, TwoBit, TXT, UCSC track, VCF, XLS/XLSX, Zip.
 Found type: raw

Is there a different SourceType in the list that would be more appropriate than raw?
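
The rejection above can be caught before the metadata reaches the hub. The following is a minimal shell sketch and not part of ExperimentHubData: the allowed values are copied from the error message, and the helper name `is_allowed_sourcetype` is made up for illustration.

```shell
#!/usr/bin/env bash
# Allowed values, copied from the error message above.
ALLOWED_SOURCETYPES="BAI, BAM, BED, BigWig, BioPax, BioPaxLevel2, BioPaxLevel3, CEL, Chain, CSV, ensembl, FASTA, FASTQ, FCS, GFF, GRASP, GTF, HDF5, IDAT, Inparanoid, JSON, MTX, MySQL, mzid, mzML, mzTab, mzXML, NCBI/blast2GO, NCBI/ensembl, NCBI/UniProt, RDA, RData, Simulated, tab, tar.gz, TSV, TwoBit, TXT, UCSC track, VCF, XLS/XLSX, Zip"

# Hypothetical helper (not an ExperimentHubData function):
# succeeds only if $1 equals one allowed value (case-sensitive).
is_allowed_sourcetype() {
    # Reject empty input and anything containing a comma, so that a value
    # such as "BAM, BED" cannot match across two list entries.
    case "$1" in
        ""|*,*) return 1 ;;
    esac
    case ", ${ALLOWED_SOURCETYPES}, " in
        *", $1, "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: report each candidate value.
for candidate in mzXML raw; do
    if is_allowed_sourcetype "$candidate"; then
        echo "$candidate: accepted"
    else
        echo "$candidate: not accepted"
    fi
done
```

With the list as it stood at this point in the thread, `mzXML` is accepted and `raw` is not, which is why the raw files needed a different SourceType.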

cpanse commented 5 years ago

@lshep This is a proprietary file format (consider it a binary large object (BLOB)). We are going to read the file format using the MsBackendRawfileReader and Spectra packages in the near future.

lshep commented 5 years ago

Ok. I will add BLOB as an acceptable SourceType - could you update your metadata to reflect that please?

lshep commented 5 years ago

I updated the SourceType on my end to BLOB for the raw files and added the data to the hubs. It is now accessible using Bioc devel 3.10. Please update any necessary code and comment back here when you are ready for a review.

> eh = ExperimentHub()
  |======================================================================| 100%

snapshotDate(): 2019-10-18
> query(eh, "tartare")
ExperimentHub with 4 records
# snapshotDate(): 2019-10-18 
# $dataprovider: Functional Genomics Center Zurich (FGCZ)
# $species: NA
# $rdataclass: Spectra
# additional mcols(): taxonomyid, genome, description,
#   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
#   rdatapath, sourceurl, sourcetype 
# retrieve records with, e.g., 'object[["EH3219"]]' 

           title                
  EH3219 | Q Exactive HF-X mzXML
  EH3220 | Q Exactive HF-X raw  
  EH3221 | Fusion Lumos mzXML   
  EH3222 | Fusion Lumos raw     
> path = query(eh, "tartare")[[1]]
downloading 1 resources
retrieving 1 resource
  |======================================================================| 100%
> path
                                      EH3219 : 3235 
"/home/lori/.cache/ExperimentHub/5f314cca58e3_3235" 

cpanse commented 5 years ago

@lshep So I should replace 'raw' with 'BLOB'? C

lshep commented 5 years ago

Yes please. Then update your code to use the ExperimentHub. When you are ready for a review just tag me again and I'll have a look at the underlying code of the package. Cheers

cpanse commented 5 years ago

@lshep For whatever reason, the file 203c51cb0c6f_3237 seems to be corrupt on AWS. Maybe my upload got interrupted.

cp@fgcz-113:~/__checkouts/R/tartare  (master)> md5 /Users/cp/Library/Caches/ExperimentHub/203c51cb0c6f_3237
MD5 (/Users/cp/Library/Caches/ExperimentHub/203c51cb0c6f_3237) = a4bf6ecb8adcd28ea07a034483aa1d5e

Also, the output of `tail -n 1 /Users/cp/Library/Caches/ExperimentHub/203c51cb0c6f_3237` looks like an incomplete mzXML file. Sorry for the trouble. C
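
The truncation check described above can be scripted. This is a rough sketch, assuming that a complete mzXML document ends with the closing root tag `</mzXML>`. The helper name `looks_complete_mzxml` and the 200-byte window are arbitrary choices for illustration, and passing the check does not prove that the file is valid XML.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the last 200 bytes of file $1 contain the
# closing root tag, which a truncated upload or download would be missing.
looks_complete_mzxml() {
    tail -c 200 "$1" | grep -q '</mzXML>'
}

# Usage: pass a file path as the first argument to the script.
if [ -n "${1:-}" ]; then
    if looks_complete_mzxml "$1"; then
        echo "$1: closing tag found"
    else
        echo "$1: closing tag missing, file may be truncated"
    fi
fi
```

Looking at a byte window rather than only the last line keeps the check tolerant of trailing newlines.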

lshep commented 5 years ago

I'm not sure how big the files are supposed to be. The file sizes we currently have are:

> getInfoOnIds(eh, paste0("EH", c(3219:3222)))
      ah_id fetch_id                 title rdataclass status biocversion
3089 EH3219     3235 Q Exactive HF-X mzXML    Spectra Public        3.10
3090 EH3220     3236   Q Exactive HF-X raw    Spectra Public        3.10
3091 EH3221     3237    Fusion Lumos mzXML    Spectra Public        3.10
3092 EH3222     3238      Fusion Lumos raw    Spectra Public        3.10
     rdatadateadded rdatadateremoved file_size
3089     2019-10-18             <NA>  21295960
3090     2019-10-18             <NA>  34093249
3091     2019-10-18             <NA>  46487407
3092     2019-10-18             <NA> 122609648

If you think your session didn't download the file correctly, you can run `eh[["EH3221", force=TRUE]]`, which forces a re-download. If the file seems to be corrupt on S3 itself, you can re-upload it and I will replace it. I believe the credentials I sent you should still be active.
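
A cheap first check for an interrupted download is to compare the size of the cached file with the `file_size` column reported above (46487407 bytes for EH3221). The following is a minimal sketch; the helper name `has_expected_size` is illustrative. A matching size only shows that the local copy is as long as the copy the hub holds. It cannot tell whether the original upload was complete, so a checksum remains the stronger test.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if file $1 is exactly $2 bytes long.
# Returns 2 if the file does not exist.
has_expected_size() {
    local file=$1 expected=$2 actual
    [ -f "$file" ] || return 2
    # Strip the padding that some wc implementations add around the count.
    actual=$(wc -c < "$file" | tr -d '[:space:]')
    [ "$actual" -eq "$expected" ]
}

# Usage: script.sh <file> <expected size in bytes>
if [ "$#" -eq 2 ]; then
    if has_expected_size "$1" "$2"; then
        echo "size matches"
    else
        echo "size differs or file is missing"
    fi
fi
```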

cpanse commented 5 years ago

@lshep There must have been a second session (after that weekend) in which I uploaded one mzXML again. C

lshep commented 5 years ago

I don't have any other data in the S3 folder for tartare. Could you upload again?

cpanse commented 5 years ago

@lshep done

cp@fgcz-148:~/tartare > aws --profile AnnotationContributor s3 cp 20190716_004_PierceHeLa.mzXML s3://annotation-contributor/tatare/20190716_004_PierceHeLa.mzXML --acl public-read
upload: ./20190716_004_PierceHeLa.mzXML to s3://annotation-contributor/tatare/20190716_004_PierceHeLa.mzXML
cp@fgcz-148:~/tartare > md5sum 20190716_004_PierceHeLa.mzXML 
6f7485fc3b5864bac51a215200a52101  20190716_004_PierceHeLa.mzXML

thanks C

lshep commented 5 years ago

I updated the file on S3. Please try to access it again. The hub should recognize that it's a new version of the file and download it automatically. Let me know if it doesn't.
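
After the re-download, the cached copy can be compared against the checksum of the uploaded file. This sketch works with either `md5sum` (Linux) or `md5` (macOS), both of which appear in this thread; the helper names `file_md5` and `md5_matches` are made up for illustration.

```shell
#!/usr/bin/env bash
# Print the MD5 hex digest of file $1 with whichever tool is installed:
# md5sum (GNU coreutils) or md5 (macOS/BSD).
file_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | cut -d ' ' -f 1
    elif command -v md5 >/dev/null 2>&1; then
        md5 -q "$1"
    else
        echo "neither md5sum nor md5 found" >&2
        return 1
    fi
}

# Hypothetical helper: succeeds if the digest of file $1 equals $2.
md5_matches() {
    local digest
    digest=$(file_md5 "$1") || return 2
    [ "$digest" = "$2" ]
}

# Usage: script.sh <file> <expected digest>
if [ "$#" -eq 2 ]; then
    if md5_matches "$1" "$2"; then
        echo "checksum matches"
    else
        echo "checksum differs"
    fi
fi
```

For the re-uploaded mzXML file, the expected digest is the one reported with the upload, `6f7485fc3b5864bac51a215200a52101`.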

lshep commented 5 years ago

Hopefully that fixed the issue. In an effort to get this package into this release, please see the other comments concerning your package. Hopefully they can be fixed so this package can be added and released with Bioc 3.9.

README

vignette

R / man

general

As it could affect vignette and R/man

exec

Cheers

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

1b1d814 version pump

bioc-issue-bot commented 5 years ago

On one or more platforms, the build results were: "skipped, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

222e4c7 initial tartar.R
a5e5357 version inc

bioc-issue-bot commented 5 years ago

On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

b62e77b version inc

bioc-issue-bot commented 5 years ago

On one or more platforms, the build results were: "WARNINGS". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

cpanse commented 5 years ago

> Hopefully that fixed the issue. In an effort to get this package into this release, please see the other comments concerning your package. Hopefully they can be fixed so this package can be added and released with Bioc 3.9.

Yes, it fixed the md5 issue. Thanks!

> README
>
> * [x]  (optional) include Bioconductor installation instructions.

DONE

> vignette
>
> * [ ]  Perhaps most of this seems like data preparation and should be in the
>   `inst/scripts` describing how the data was made. Please move this file to
>   `inst/scripts`

Most of the content is instrument configuration.

> * [ ]  I recommend still having an official vignette `tartare.Rmd`. It is much
>   more intuitive for a user to do `vignette("tartare")` that shows the
>   ExperimentHub commands to download and access data

Fully agree. The instrument setups are the most important part.

> R / man
>
> * [x]  I think you should be very explicit that running the getFilename will
>   download the files associated with the package. So the users are aware there will
>   be a download.

Done.

> * [x]  Please create a different R file to contain package code. The `zzz.R` file is
>   generally only associated with code for `onLoad/onAttach`.

Right; done.

> general
>
> As it could affect vignette and R/man
>
> * [ ]  Would there be a use case to show the file names associated with this
>   without downloading?

The package is meant to be used together with Spectra and MsBackendRawfileReader. I added a picture to the README that @jorainer presented today at the #SWEMSA conference.

> exec
>
> * [x]  Please remove and then `.gitignore` this directory. Important for you but
>   not for anyone else.

DONE

> Cheers

I am going to fix the warnings later today.

Best wishes, C

lshep commented 5 years ago

Awesome. I'll wait for the final corrections of the warnings but I don't see anything else preventing acceptance.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

60bb902 version inc
d0aded6 eliminate warnings

lshep commented 5 years ago

Hang tight - we are experiencing some delay with the builder. We are in the process of fixing it.

bioc-issue-bot commented 5 years ago

On one or more platforms, the build results were: "WARNINGS, ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

bioc-issue-bot commented 5 years ago

Received a valid push; starting a build. Commits are:

434709c windows warnings and version inc

bioc-issue-bot commented 5 years ago

Congratulations! The package built without errors or warnings on all platforms.

cpanse commented 5 years ago

> Awesome. I'll wait for the final corrections of the warnings but I don't see anything else preventing acceptance.

DONE. No more warnings. Thanks. C