Bioconductor / Contributions

Contribute Packages to Bioconductor
133 stars 33 forks source link

rifiComparative #2718

Closed lyoussar closed 1 year ago

lyoussar commented 2 years ago

Update the following URL to point to the GitHub repository of the package you wish to submit to Bioconductor

Confirm the following by editing each check box to '[x]'

I am familiar with the essential aspects of Bioconductor software management, including:

For questions/help about the submission process, including questions about the output of the automatic reports generated by the SPB (Single Package Builder), please use the #package-submission channel of our Community Slack. Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot commented 2 years ago

Hi @lyoussar

Thanks for submitting your package. We are taking a quick look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: rifiComparative
Title: 'rifiComparative' compares the output of rifi from 2 different conditions
Version: 0.99.0
Authors@R: c(person("Loubna", "Youssar",
      email = "loubna.youssar@biologie.uni-freiburg.de",
      role = c("aut", "cre")),
      person("Jens", "Georg",
      email = "jens.georg@biologie.uni-freiburg.de",
      role = "aut", "cre"))
Description: 'rifiComparative' is a continuation of rifi package. It compares two conditions output of rifi using half-life and mRNA at time 0 segments. As an input for the segmentation, the difference between half-life of both condtions and log2FC of the mRNA at time 0 are used. The segmentation 'rifiComparative' provides segmentation, statistics, summary table, fragments visualization and some additional useful plots for further anaylsis. 
Depends:
    R (>= 4.1)
Imports:
    cowplot,
    doMC,
    parallel,
    dplyr,
    egg,
    foreach,
    ggplot2,
    ggrepel,
    graphics,
    grDevices,
    grid,
    methods,
    nnet,
    rlang,
    S4Vectors,
    scales,
    stats,
    stringr,
    tibble,
    rtracklayer,
    utils,
    writexl,
    DTA,
    LSD,
    reshape2,
    SummarizedExperiment
Suggests:
    DescTools,
    ggrepel,
    knitr,
    rmarkdown,
    BiocStyle
VignetteBuilder: knitr  
biocViews: RNASeq, DifferentialExpression, GeneRegulation, Transcriptomics, Microarray, Software
BugReports: https://github.com/CyanolabFreiburg/rifiComparative
License: GPL-3 + file LICENSE
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.12
Language: en-US
vjcitn commented 2 years ago

pre-check is passed but think about user effort to do

df_comb_minimal[,"distance_int"] <- df_comb_minimal[,"logFC_int"]

pen_int <- make_pen(
    probe = df_comb_minimal,
    FUN = rifiComparative:::fragment_inty_pen,
    cores = 60,
    logs = as.numeric(rep(NA, 8)),
    dpt = 1,
    smpl_min = 10,
    smpl_max = 50,
    sta_pen = 0.5,
    end_pen = 4.5,
    rez_pen = 9,
    sta_pen_out = 0.5,
    end_pen_out = 3.5,
    rez_pen_out = 7
)

from vignette. are the defaults good enough?

lyoussar commented 2 years ago

So far the defaults parameters were used in several organisms without any need for adjustment. Nevertheless I made the vignette simpler and I changed the cores number.

bioc-issue-bot commented 2 years ago

A reviewer has been assigned to your package. Learn what to expect during the review process.

IMPORTANT: Please read this documentation for setting up remotes to push to git.bioconductor.org. It is required to push a version bump to git.bioconductor.org to trigger a new build.

Bioconductor utilized your github ssh-keys for git.bioconductor.org access. To manage keys and future access you may want to active your Bioconductor Git Credentials Account

bioc-issue-bot commented 2 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/rifiComparative to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

lyoussar commented 2 years ago

The package could not be built in Windows. I added a file called .BBSoptions Thanks

lshep commented 2 years ago

Please also correct the ERROR in the build report and trigger a new report

Get Outlook for iOShttps://aka.ms/o0ukef


From: Loubna Youssar @.> Sent: Friday, July 22, 2022 2:34:20 PM To: Bioconductor/Contributions @.> Cc: Subscribed @.***> Subject: Re: [Bioconductor/Contributions] rifiComparative (Issue #2718)

The package could not be built in Windows. I added a file called .BBSoptions Thanks

— Reply to this email directly, view it on GitHubhttps://secure-web.cisco.com/1i5Qb0HBWgnxB6AHqOr6aM7TgXq36TUV-RtcDpdIYkhmJZHfQleHaFFsnvXHPH0ZtN10bl5_oCnC-WoAGBwhuspP0iaXAtIE1E8iKbyeVgpz65VUStMecfon1OtVBJqZk76Xq3aK2tqItT9sBpES1-02Dd2OyFmzJzZIhNN-ys31O3_K1fjHEO4S-ia8mprTXC8dgK297V75MhLTYufnknOZo-NxwoEHZtOaP5fyiFjeoMLObu-19oMfeZpZIvhYBQP4bNUtAfNwhbkeompwjixtGmxe9IZRV2r74lnq4BwlQXGasinchW6hzUsGXMHrZ/https%3A%2F%2Fgithub.com%2FBioconductor%2FContributions%2Fissues%2F2718%23issuecomment-1192838276, or unsubscribehttps://secure-web.cisco.com/11El2001sxMIcvYarqELvfpZVbSvksaJxsWQwqip3liSQOk4HytFAQ6Qjbgx8qsrB2BXD0EWcyFSFszE026J55PGP_Fw_SyHEk6MFUp_zitkLATBC7Q70yGC2j5_g0fH3SUSzHYg2Iuhm65e1IDRJiK1s9tdsxWU0oIxy0HSz-QVML7Uq-oj7bUt4j5t_2TIY-gVyUYnYUUHfaOYCDRH1RP9x5qBIzeWogVSPk4BWsqjAoZ-lajDoxoo1x2YdRONAATCFXcs55BjQxQLBeBGcjaIxnx_Sf1-P76Wzz9znAyMC9s-1SE5AyN-aWCYsKJbz/https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAEO3MHHH2MJVORQQYIZTIMTVVLSSZANCNFSM53UXKVJQ. You are receiving this because you are subscribed to this thread.Message ID: @.***>

This email message may contain legally privileged and/or confidential information. If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited. If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you.

lyoussar commented 2 years ago

I am subscribed to both lists: https://support.bioconductor.org/accounts/signup/ https://stat.ethz.ch/mailman/listinfo/bioc-devel I wonder why do I get these errors!? I tried to subscribe again just in case of and I ve got the email below confirming my subscription. An attempt was made to subscribe your address to the mailing list bioc-devel@r-project.org. You are already subscribed to this mailing list.

PS: all package developer should be subscribed? I guess only the maintenair right? My email address as maintenair appears as last line in READ.ME

lyoussar commented 2 years ago

Its solved, no errors neither warnings so far.

HelenaLC commented 2 years ago

First round of comments below! Note that many are suggestions and not strictly
required (e.g., "consider..."). Please comment back here with what has/not been
addressed how/why not, or any comments regarding any needed clarification or
questions you might have; happy to help/discuss. Cheers!

documentation

vignette

because of this (where logs is undefined):

logs[c( paste0(names(tmp)[1], "_penalty"), paste0(names(tmp)[1], "_outlier_penalty") )] <- c(res1[1], res1[2])



# code

- [ ] In *plot_histogram.r*, avoid `sapply()` and use `vapply()` instead.
- [ ] In *empty_data_negative.r*, avoid `print()` and use `message()` instead,  
ideally with a `verbose = FALSE/TRUE` function argument, so that users have  
control over whether or not additional information is printed to the console.
- [ ] `figures_fun()` currently writes all output plots to the current  
working directory, with fixed files names (specified internally). Instead,  
I'd encourage exposing a `dir` (directory) argument so that users can specify  
where plots should be saved to. In addition, it would be nice having an option  
`dir = NULL` where plots are returned as R objects (`ggplot`s) that can be  
saved to a variable, modified further, and saved in a custom way.
- [ ] Related to the above, it would also be nice to export the different  
plotting functions (with separate examples) such that users can generate  
plots of interest rather than having to generate all of them `figure_fun()`.
- [ ] In *plot_scatter.r*, `geom_smooth()` gives a message  
that can be avoided by specifying `formula = y ~ x`.
- [ ] In *plot_histogram.r*, `reshape2::melt()` gives a message  
that can be avoided by specifying `id = "category"`.
- [ ] Some function examples are quite costly to run (2min+ on my laptop).  
Not sure if the example data size could be decreased, but in any case,  
it might be mostly due to inefficient implementation.  
I'd highly recommend reading up on [Robust and Efficient R Code](https://contributions.bioconductor.org/r-code.html?q=lapply#robust-and-efficient-r-code).  
E.g., there's a 3-level nested `for`-loop in `make_pen()` that seams worth  
re-implementing, especially as its called repeatedly by other functions.

# other

- [ ] Consider adding a NEWS file to track package updates;  
these will also be included in Bioconductor release announcements.
- [ ] Consider adding unit tests; we strongly encourage them!  
See https://contributions.bioconductor.org/tests.html
- [ ] Consider adding a package *.man* page (i.e., `?rifiComparative`)  
so users would be directed to the main functions of the package.
- [ ] The raw package directory should not contain unnecessary files, system  
files, or hidden files such as \*.Rproj. These files may be present in your  
local directory but should not be commited to git (e.g., using .gitignore).
lyoussar commented 2 years ago

First round of comments below! Note that many are suggestions and not strictly required (e.g., "consider..."). Please comment back here with what has/not been addressed how/why not, or any comments regarding any needed clarification or questions you might have; happy to help/discuss. Cheers!

Thanks for your comments, many of them are very helpful. I tried to address all questions/suggestions including or correcting issues. [x] means solved/corrected.
Best, Loubna

documentation

  • [x] Typo in _makepen.r line 47: vetor -> vector
  • [x] Also in _makepen.r: Documentation says default of dpt is 2, but it's 1.
  • [x] Generally, @importFrom is encouraged over importing an entire package with @import. If there are many functions from a single package, @import can be acceptable. However, you are currently importing all of dplyr,ggplot2,SummarizedExperiment,scales,writexl,DRA,ggrepel,LSD, which would definitely be worth reconsidering to avoid namespace conflicts.

You are right, once many functions are from the same package, they are imported as importFrom e.g. “utils write.table read.delim2 capture.output” but the imported packages are individuals.

*[x] The elements and data structure of example data is well described, which is appreciated! However, please also include details on the source information, or how the example data was generated.

  **Thanks for this comment. The source information is indicated in ReadMe and on the vignette with the corresponding link of the package where the data was generated 
  “RifiComparative is a workflow for a comparative data, output of Rifi framework (https://www.bioconductor.org/packages/release/bioc/html/rifi.html”).
  Now I changed the link of github to bioconductor.**
  • [x] Regarding function .man (help) pages, I think there's lots of ill formatting due to wrong usage of roxygen2. To pick just one example ?penalties has the function description as a header. Instead, this should go in @description or under @details with a short, descriptive title under @title (see here for details); here's one example (but all help pages I checked looked somewhat off):
#' @rdname make_pen
#' @title Penalty assignment
#' @description
#'   `make_pen` automatically assigns a penalties.
#'   `make_pen` calls one of four available penalty functions
#'   to automatically assign penalties for the dynamic programming. 
#'   The two functions to be called are: ...

Yes, you are right. There were some odd documentations. I revised and set all of them making a convenient structure for the user.

vignette

  • [ ] Please use a less generic name (i.e., not vignette.Rmd) to make less ambiguous to locate with, e.g., vignette() I did not get this point, sorry!

  • [x] line 55: resultinf -> resulting

  • [x] Throughout the vignette, I'd suggest data frame and dataframe -> data.frame I used the ... to indicate functions and it might be confusing if I use it somewhere else. However I make data frame all homogeneous with space in the middle.

  • [x] To highlight / distinguish code from normal text, I'd recommend consistently using back-ticks. Do you refer to the vignette? All codes are between the 3 back-ticks or do you mean something else?

  • [ ] Similarly, to direct users to relevant external packages and highlight/make them distinguishable from general code, you can hyperlink them via BiocStyle::CRANpkg("package_name") (or Githubpkg or Biocpkg, depending on where the package sits). Could you please specify where the changes should be applied?

  • [x] Code chunks "make_pen HL" and "make_pen int" have eval = FALSE but fail because argument logs does not have a default value (see below).

# make_pen() fails with:
Error in logs[...: 
object of type 'symbol' is not subsettable

# because of this (where logs is undefined):
logs[c(
  paste0(names(tmp)[1], "_penalty"),
  paste0(names(tmp)[1], "_outlier_penalty")
)] <- c(res1[1], res1[2])

That true, I added now and it runs like a charm.

code

  • [x] In _plothistogram.r, avoid sapply() and use vapply() instead. Vapply does not provide the output I need after many tries, sapply in this case is saver

  • [x] In _empty_datanegative.r, avoid print() and use message() instead, ideally with a verbose = FALSE/TRUE function argument, so that users have control over whether or not additional information is printed to the console.

  • [] figures_fun() currently writes all output plots to the current working directory, with fixed files names (specified internally). Instead, I'd encourage exposing a dir (directory) argument so that users can specify where plots should be saved to. In addition, it would be nice having an option dir = NULL where plots are returned as R objects (ggplots) that can be saved to a variable, modified further, and saved in a custom way.

  • [x] Related to the above, it would also be nice to export the different plotting functions (with separate examples) such that users can generate plots of interest rather than having to generate all of them figure_fun(). The user can always run each plot separately just by reading the documentation of figure_fun(). He can still select at the end the plots he need, they are not computationally expensive and generated in less than a minutes independently how big are the data.

  • [x] In _plotscatter.r, geom_smooth() gives a message that can be avoided by specifying formula = y ~ x. They are suppressed by now.

  • [x] In _plothistogram.r, reshape2::melt() gives a message that can be avoided by specifying id = "category". I tried it before submitting and did not work

  • [x] Some function examples are quite costly to run (2min+ on my laptop). Not sure if the example data size could be decreased, but in any case, it might be mostly due to inefficient implementation. I'd highly recommend reading up on Robust and Efficient R Code. E.g., there's a 3-level nested for-loop in make_pen() that seams worth re-implementing, especially as its called repeatedly by other functions. There are two functions which are computationally expensive. Penalties and fragmentation. These two functions make many iterations to find the best model so it would be impossible to make a smaller example as it will not show a reasonable output.

other

  • [x] Consider adding a NEWS file to track package updates; these will also be included in Bioconductor release announcements.
  • [ ] Consider adding unit tests; we strongly encourage them! See https://contributions.bioconductor.org/tests.html
  • [x] Consider adding a package .man page (i.e., ?rifiComparative) so users would be directed to the main functions of the package. I added this file but so far I can not make it functional!?
  • [x] The raw package directory should not contain unnecessary files, system files, or hidden files such as *.Rproj. These files may be present in your local directory but should not be committed to git (e.g., using .gitignore).
HelenaLC commented 2 years ago

Thanks for the speedy action. I can see changes have been made in the package's GH repo, however, in order for these to go through to Bioc, you need to bump the package version in the DESCRIPTION file (i.e., from 0.99.1 to 0.99.2); thanks!

lyoussar commented 2 years ago

Thanks and done!

HelenaLC commented 2 years ago

Again, I can see this on your GH. However, if a build has been triggered, you'd see a message here from the bioc-issue-bot, which is not the case. You also need to push these changes upstream to the upstream Bioc remote (not only your personal GH remote). See here for a quick tutorial on setting up remotes and pushing to upstream.

lyoussar commented 2 years ago

I followed the instructions. I forked from the home repository (CyanolabFreiburg) to mine (lyoussar) and added a remote named upstream to my package.

origin https://github.com/CyanolabFreiburg/rifi_comparative_data.git (fetch) origin https://github.com/CyanolabFreiburg/rifi_comparative_data.git (push) upstream git@git.bioconductor.org:packages/git@github.com:lyoussar/rifiComparative.git (fetch) upstream git@git.bioconductor.org:packages/git@github.com:lyoussar/rifiComparative.git (push)

Please let me know if that is enough/correct.

Thanks!

HelenaLC commented 2 years ago

Yes, that all looks correct. Now, when you want to make changes, you should push to both the origin and upstream remote. I.e., commit -m “…” — push origin — push master.

lyoussar commented 2 years ago

Great, Thanks!

HelenaLC commented 2 years ago

Did you try doing this? The package being built here is still v0.99.0 whereas your Github repo DESCRIPTION has v0.99.2.

lyoussar commented 2 years ago

I just built the package again with the last version and I get 0 errors, 0 warnings and few notes. Is that what you referring to?

HelenaLC commented 2 years ago

Hm. I can only see a single build report from 21 days ago here. Did you push upstream as explained here? I.e., something like

git add ...
git commit -m "..."
git push origin   # push to your Github repo
git push upstream # push to the Bioc server
lyoussar commented 2 years ago

I have being struggling the last 2 hours to find a solution to that:

git fetch --all Fetching origin Fetching upstream git@git.bioconductor.org: Permission denied (publickey). fatal: Could not read from remote repository.

Please make sure you have the correct access rights and the repository exists. error: Could not fetch upstream

This is the remote:

git remote -v origin https://github.com/CyanolabFreiburg/rifi_comparative_data.git (fetch) origin https://github.com/CyanolabFreiburg/rifi_comparative_data.git (push) upstream git@git.bioconductor.org:packages/git@github.com:lyoussar/rifiComparative.git (fetch) upstream git@git.bioconductor.org:packages/git@github.com:lyoussar/rifiComparative.git (push)

Reading somewhere on the email list, I found out this answer from a reviewer: @nturaga I think this is because I changed the contents of https://github.com/kozo2.keys after creating this issue. Could you please check the SSH public key registration again?

Could u help me please? is it related to bioconductor? me?

Thanks

lshep commented 2 years ago

Keys should update automatically once changed.

It looks like your upstream is set incorrectly.

Yours is currently set: git@git.bioconductor.org:packages/git@github.com:lyoussar/rifiComparative.git

Where it has an extra git@github.com:lyoussar

Please do

git remote remove upstream

git remote add upstream git@bioconductor.org:packages/rifiComparative.git git remote fetch --all

and try again.

If you still have trouble please remember to activate your GitCredentials account and perhaps update with a new ssh-key.

lyoussar commented 2 years ago

I set a new key and activate the GitCredentials, I built again the package and I am still not getting access, I wonder what I do miss.

git remote -v origin git@github.com:lyoussar/rifiComparative.git (fetch) origin git@github.com:lyoussar/rifiComparative.git (push) upstream git@git.bioconductor.org:packages/rifiComparative.git (fetch) upstream git@git.bioconductor.org:packages/rifiComparative.git (push)

git fetch --all Fetching origin ECDSA host key for IP address '140.82.121.36' not in list of known hosts. Fetching upstream git@git.bioconductor.org: Permission denied (publickey). fatal: Could not read from remote repository.

Please make sure you have the correct access rights and the repository exists. error: Could not fetch upstream

lshep commented 2 years ago

What are the results of ssh -T git@git.bioconductor.org

lshep commented 2 years ago

Please also check out numbers 13, 14, and 15 of http://contributions.bioconductor.org/git-version-control.html#faq

lyoussar commented 2 years ago

ssh -T git@git.bioconductor.org git@git.bioconductor.org: Permission denied (publickey).

lshep commented 2 years ago

Check out 14/15 of the faqs link above -- did you call the file with your key a different name?

lyoussar commented 2 years ago

Thanks again for your support.

I went through all these faqs, changing the config file of ssh ect.. and I get the same answer from ssh -T git... Then I wanted to check my BiocCredentials to see what SSH key I registered. I was unable to access my account, I tried to generate a new password but so far I did not receive any email neither in the spam box. Checking in Internet I found this issue very frequent and sometimes the account is reactivated by the same reviewer like in this case (https://www.mail-archive.com/search?l=bioc-devel@r-project.org&q=subject:%22Re%5C%3A+%5C%5BBioc%5C-devel%5C%5D+Problems+with+BiocCredentials%22&o=newest&f=1)

May I ask u to activate it for me? I am still did not receive any email yet.

Thank you again

nturaga commented 2 years ago

Hi @lyoussar

Just stepping in here to add my 2 cents.

You have already activated your BiocCredentials account. Once it is active, the activation does NOT work.

You need to now log in with the password you set, or say "reset password". If that did not work, it's possible you are using the wrong email? The email attached your id is loubna.youssar@biologie.uni-freiburg.de.

The other thing I notice is your account has 3 keys. I'm pretty confident that you are not matching the correct key to what is available on the Bioconductor server. The key's work as a public <-> private key pair. You have to match the private key (which only lives on your local machine), to the public key which you gave to Bioconductor.

It's possible to match keys using something like this, that generates a public key from the private key you have locally.

ssh-keygen -y -f ~/.ssh/id_rsa

Best,

nturaga commented 2 years ago

Another note, sometimes when users don't get the email from BiocCredentials, it is because their "institute" has firewalls. This is a problem we are unable to get around and is very institute specific. A way to go around it is to use an email address that is not administered by an institute. But you should do this based on what the best option is for you.

lyoussar commented 2 years ago

Bingo! The email address was the issue of the whole story!! I changed it to my private email, set the key and everything works as a charme. Here is the confirmation:

git fetch --all Fetching origin Fetching upstream From git.bioconductor.org:packages/rifiComparative

Let me know if I need to set up anything else. Thanks!

HelenaLC commented 2 years ago

Could you try pushing your recent changes upstream (with a version bump to trigger a new build)? The one and only available build report is still v0.99.0 (dated July 22nd).

bioc-issue-bot commented 2 years ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 99e8a0c6b060abe3792e5d5f36cb01f7877ebecb

bioc-issue-bot commented 2 years ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/rifiComparative to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

HelenaLC commented 2 years ago

vignette

documentation

code

lyoussar commented 2 years ago

Thanks again for your help and notes. I went through each statement and here are my comments:

vignette

  • [X] Regarding "a less generic name": All I meant was to rename vignettes/vignette.Rmd to something else, e.g., vignettes/rifiComparative.Rmd. Solved

  • [X] Regarding code highlighting: Yes, function names are generally enclosed in back-ticks. However, this might also be worth doing for anything else in R (i.e., variable names, packages, ...). To give two examples: The introduction talks about the package rifiComparative; and, under "2. Penalties", you refer to variables distance_HL, df_comb_minimal, ... computed above. It's just easier to read when one can easily distinguish between what is and what is not R-related. rifiComparative package is a successor of rifi (Bioconductor package). The users of this package are forced to use the predecessor one for the input. Many functions are used in both packages and are commented and named same for reference. Therefore I can not change neither the name or add more comments unless if they are necessary.

  • [X] Regarding hyperlinking packages: E.g., "from dplyr package" (line 80) could be written as: "from the r BiocStyle::CRANpkg("dplyr") package", which i) renders a clickable hyperlink to dplyr; and, ii) automatically highlights/distinguishes package names from normal back ticks. This is but one example, but I'd recommend it for anywhere you mention a package (there are similar BiocStyle::Xpkg() functions for, e.g., packages on Github, Bioconductor, ...). Solved

documentation

  • [X] The documentation looks all right now, thank you. On another note: I think it's not necessary (and perhaps extra maintenance) to mention each argument's default value under @param (e.g., ?make_pen); these are visible under the Usage section already. Instead, it would be of more interest to the user to explain under Details i) how these defaults were selected; and, ii) what impact lower/higher values might have or on what scale these should be etc.

Same answer as above: rifiComparative package is a successor of rifi (Bioconductor package). The users are forced to use the predecessor one for the input. Many functions are used in both packages and are commented and named same for reference. Therefore I can not change neither the name or add more comments unless if they are necessary.

  • [X] Maybe this wasn't clear, but @import should be generally avoided, irrespective of whether few or many functions are being imported. You are still importing all of dplyr, ggplot2, SummarizedExperiment,scales, writexl, DTA, ggrepel and LSD, though it's not clear how these are used at the moment. A generally good habit is to include selective @importFroms within each function's roxygen header (e.g., before @export). This i) avoids importing entire packages;and, ii) clarifies which functions are being used from which package. Solved

  • [X] Regarding adding a package .man page (accessible via ?rifiComparative). Not sure where you added this as I cannot find it; but simply adding a R/rifiComparative.r script should do the trick without @export and with NULL after the roxygen body. This will render a normal documentation page with free-format. It can be as simple as this, or you could link key functions and summarize the general workflow etc. Solved

code

  • [X] _plothistogram.r: Sorry, it should be id.vars not id. Writing out reshape2::melt(hist_sf, id.vars = "category") will avoid the current message "Using category as id variables". Solved

  • [X] You didn't check off or comment on unit tests; just double-mentioning this because at least a little bit of unit testing is easy to implement and so so beneficial! You are right, unit tests are very useful. However and on one hand I try to keep the same style as rifi package and unit tests were not implemented. On other hand, I put in place easy and simple examples in vignette making the package user-friendly.

HelenaLC commented 1 year ago

Again, please make sure to push these changes upstream with a version bump from v0.99.2 to v0.99.3 in order to trigger a new build. The Bioc version is currently not in synch with what I am seeing on the origin remote (i.e., your GitHub).

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 43908cb741eafb5e01c048593d9e30f85dce5835

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

On one or more platforms, the build results were: "ERROR, skipped". This may mean there is a problem with the package that you need to fix. Or it may mean that there is a problem with the build system itself.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/rifiComparative to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: fca847fce52e18f19fdcb80995cf3016d5281431

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/rifiComparative to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

HelenaLC commented 1 year ago

Ok, I believe all essential changes have been made. Nevertheless, I kindly ask you to consider the comments below. Please bare in mind that throughout development - that is, also after acceptance and the package being available through Bioc - you will need to update both the origin and upstream branch (with a version bump) for changes to go through to Bioconductor and be available to users upon installation using BiocManager::install().

current:

Same as rifi workflow, to get the best segmentation we need the optimal penalties. To calculate HL penalty, the difference between half-life from both conditions is calculated and added as distance_HL variable to df_comb_minimal data frame. On other hand the logFC_int is used to assign penalties for intensity values and added as distance_int variable. df_comb_minimal with the additional variables is named penalties_df.

proposition:

Same as `r BiocStyle::Biocpkg("rifi")` workflow, to get the best segmentation we need the optimal penalties. To calculate HL penalty, the difference between half-life from both conditions is calculated and added as `distance_HL` variable to `df_comb_minimal` data frame. On other hand the `logFC_int` is used to assign penalties for intensity values and added as `distance_int` variable. `df_comb_minimal` with the additional variables is named `penalties_df`.

lyoussar commented 1 year ago
  • Please remember to update the NEWS file to track package changes and updates. Currently, only version 0.99.1 is listed, though the package is currently at 0.99.4. Done
  • Regarding parameter defaults in function documentation: Again, I don't understand your response; could you clarify? My suggestion was not to alter the default parameter values or function definition, but to remove the listing of these from the man page to be more brief and avoid redundancy (since these values can be seen under the usage section). E.g., in make_pen's smpl_min: "integer: the smaller end of the sampling size. (Default is 10. -> drop this)". Secondly, as mentioned previously, I would ask you to elaborate on i) how these defaults were selected; and, ii) what impact lower/higher values might have or on what scale these should be etc., since these values seem fairly arbitrary otherwise and it is not clear for the user what effect changes to the defaults will have.

The first part is corrected, all defaults values were removed. The second part: I did answer this question before but seems I did not make it clear. make_pen is a function from rifi package, many values were tested on different organisms and different sequences platform before getting to the default values. The user can use the default values unless if he wishes to have a different result, in this case, genome segmentation then he can read all details on the vignette cited on rifi package. P.S: rifiComparative could not be used before using rifi as the rifi outputs are the rifiComparative inputs. I assume the user has a clear idea about the rifi steps prior running rifiComparative.

  • Regarding code highlighting in the vignette: I don't understand what this has to do with rifiComparative being a successor of rifi. I simply suggested to use single back ticks around variable, function, package names etc. to distinguish R code from normal text. At the moment, the code-style is / is not being used inconsistently, and using it throughout would improve overall readability and understandably of the vignette. For example:

current:

Same as rifi workflow, to get the best segmentation we need the optimal penalties. To calculate HL penalty, the difference between half-life from both conditions is calculated and added as distance_HL variable to df_comb_minimal data frame. On other hand the logFC_int is used to assign penalties for intensity values and added as distance_int variable. df_comb_minimal with the additional variables is named penalties_df.

proposition:

Same as r BiocStyle::Biocpkg("rifi") workflow, to get the best segmentation we need the optimal penalties. To calculate HL penalty, the difference between half-life from both conditions is calculated and added as distance_HL variable to df_comb_minimal data frame. On other hand the logFC_int is used to assign penalties for intensity values and added as distance_int variable. df_comb_minimal with the additional variables is named penalties_df.

Thanks for making it clear. All variable, function, package names are now with backticks.

bioc-issue-bot commented 1 year ago

Received a valid push on git.bioconductor.org; starting a build for commit id: 6e23d614d010e253de585443f9459d55dfe626a5

bioc-issue-bot commented 1 year ago

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on Linux, Mac, and Windows.

Congratulations! The package built without errors or warnings on all platforms.

Please see the build report for more details. This link will be active for 21 days.

Remember: if you submitted your package after July 7th, 2020, when making changes to your repository push to git@git.bioconductor.org:packages/rifiComparative to trigger a new build. A quick tutorial for setting up remotes and pushing to upstream can be found here.

HelenaLC commented 1 year ago

Okay, thanks for making a couple adjustments and clarifying the defaults. However, even in rifi's vignette, I couldn't yet find an explanation for how the defaults / parameters used were selected; and,make_pen() is called with smpl_max = 18 (not 100, as in rifiComparative); and, in general, it is sub-optimal for users to have to think of searching another (even if related) package's vignette: the current package's documentation should be self-explanatory (i.e., in itself or including relevant pointers to external sources).

To expand a bit on my current confusion (giving one example):

...all that is to point out that it would be appreciated if you could expand on the documentation such that it is made clear how the two packages interact, and (if not in rifiComparative itself) where to find additional information on rifi-specific parts of the package.

lyoussar commented 1 year ago

smpl_max = 18 was used only in rifi package and precisely in vignette for the mini sample generated to make the vignette computationally fast. The functions used in rifi are all computationally expensive and if the sample is a bit larger, would take hours. Therefore a mini instance was created and the value 18 was used. The sample can never be a real sample because it does not make any sense.

Regarding your question if make_pen in rifi and rifi_comparative are same. Yes but with a very thinny adjustment, in penalties function where make_pen is used, I changed the parameter smpl_max to 50 since I except a smaller training sample compared to rifi. The values used in rifiComparative to train the data are less noisy compared to rifi. If you check the penalties function, I am not making the parameters flexible to the user since they is no need. Only data and cores number are required. As the user would not need to change the parameters I am not sure if expanding the documentation for make_pen in rifiComparative would bring more help or confusiones.

lyoussar commented 1 year ago

Any update?

HelenaLC commented 1 year ago

My apologies, I have been quite unsure how to respond/proceed. Perhaps you could help clarify. You stated that rifiComparative is a successor to rifi. Does this mean the latter will be deprecated? If not, I do not understand why there would be a near copy-paste of functions between both packages, with i) some minor differences; ii) documentation scattered across different places; and, iii) unclarity regarding how the packages are designed to interoperate, which package's function to use, if it makes a difference or not etc. Overall, having two packages inter-operate is great. But having repeated functionality (even code) is not. Perhaps this is a misunderstanding on my end; if so, I'd be grateful if you could explain the above points (if rifi is to be continued).

lyoussar commented 1 year ago

I will try this time to be explicit and I hope I can clarify the idea. rifi package was developed with X functions. rifiComparative compares the outputs of rifi using XX functions. Both packages have different functions but share one important approach which is genome segmentation using one algorithm. This algorithm was developed by us spending quite a long time to make it efficient computationally and biologically. This algorithm could be used in any package requiring genome segmentation/fragmentation. The only need is to adapt the input and slightly the code to have the same efficiency. rifiComparative contains many functions developed exclusively for it and at certain step reuses that algorithm to accomplish one crucial step essential to make the package functional and efficient. I hope I was able to translate better the idea behind rifi and rifiComparative and I am available to provide more details if necessary.