ccb-hms / ProteinGymR

Programmatic access to ProteinGym datasets in R/Bioconductor
0 stars 0 forks source link

First edits #1

Closed tram-nguyen-n closed 1 month ago

tram-nguyen-n commented 1 month ago

Hi @lgeistlinger would you mind giving the package a first-look and provide me with any edits you may have please?

lgeistlinger commented 1 month ago

Thanks @tram-nguyen-n - a couple of notes on the README:

lgeistlinger commented 1 month ago

Some more:

lgeistlinger commented 1 month ago

A couple of notes on the DESCRIPTION file:

lgeistlinger commented 1 month ago
lgeistlinger commented 1 month ago

Add installation instructions to README (for the moment only describe how to install the package from github) ...

lgeistlinger commented 1 month ago

There is currently no vignette. Bioconductor packages are required to have a vignette.

lgeistlinger commented 1 month ago

Functions names are supposed to start lower case according to Bioconductor coding style. Consider renaming:

lgeistlinger commented 1 month ago

Function name mismatch in DMS_substitutions.R:

lgeistlinger commented 1 month ago

DMS_substitutions currently returns a list of tibbles. AlphaMissense_scores returns an ordinary data.frame. Is there a strong argument for returning tibble and thus requiring the extra dependency? My preference would be for DMS_substitutions to return a list of ordinary data.frames.

lgeistlinger commented 1 month ago

' Load AlphaMissense scores for ProteinGym variants -> #' AlphaMissense scores for ProteinGym variants

' Load ProteinGym DMS Substitution Scores -> #' ProteinGym Deep Mutational Scanning (DMS) Scores

lgeistlinger commented 1 month ago

You don't need to link the bulky reference in details here. Just say: "See references."

Same here.

tram-nguyen-n commented 1 month ago

Thanks for the edits @lgeistlinger! Incorporated them all.

For this comment:

Functions names are supposed to start lower case according to Bioconductor coding style. Consider renaming:

  • AlphaMissense_scores -> am_scores
  • DMS_substitutions -> dms_scores

I kept it dms_substitutions to distinguish it from the DMS indel scores that are provided in ProteinGym too.

I can overwrite the list object on Azure to be data.frames instead of tibbles after Aaron and Pascal's edits.

tram-nguyen-n commented 1 month ago

changes incorporated