Proposed contribution task for Outreachy applicants: Register NCBI assembly UCB_Xtro_10.0

hpages commented 2 years ago

UCB_Xtro_10.0 is a Western clawed frog (Xenopus tropicalis) assembly available at NCBI: https://www.ncbi.nlm.nih.gov/assembly/GCF_000004195.4/

Note that UCB_Xtro_10.0 is the assembly that xenTro10, the latest UCSC genome for the Western clawed frog, is based on. See "List of UCSC genome releases" at https://genome.ucsc.edu/FAQ/FAQreleases.html for all the genomes currently supported by UCSC.

Also check out the "Genome Browser Gateway" page here. This is the main entrance to the "UCSC Genome Browser". Find the Western clawed frog in the UCSC species tree on the left, click on it, then make sure to select the latest X. tropicalis Assembly (xenTro10). This will display a bunch of additional information about the xenTro10 assembly. In particular, it will indicate which NCBI assembly this genome is based on. This information is in the Accession ID field, which is usually set to a GenBank (GCA_000*.*) or RefSeq (GCF_000*.*) accession number.

Note that many NCBI assemblies are already registered in the GenomeInfoDb package (223 as of October 2022!). The registered_NCBI_assemblies() function in GenomeInfoDb returns the list of all the NCBI assemblies that are currently registered in the package. An important thing to be aware of is that getChromInfoFromNCBI() still works on an unregistered assembly, but in "degraded" mode, that is:

The name of the assembly is not recognized; only lookup by GenBank or RefSeq accession works.
The returned circularity flags are not guaranteed to be accurate. This potential inaccuracy is communicated to the user by placing NAs instead of FALSEs in the circular column of the returned data.frame.

Registering an assembly fixes that. In other words, once an NCBI assembly is registered in GenomeInfoDb, getChromInfoFromNCBI() will recognize its name and return accurate circularity flags.

See ?getChromInfoFromNCBI (after loading GenomeInfoDb) for more information.

Registering a new NCBI assembly for an organism that is already supported is only a matter of editing the corresponding file in GenomeInfoDb/inst/registered/NCBI_assemblies/. If this is a new organism, then we need to start a new file. See the other files for the naming scheme: the name of the file must be the full scientific name of the organism, with the underscore used as separator, and with the first letter capitalized. Extension must be .R.
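
As a rough illustration of the naming scheme, the shell sketch below turns an organism name into a registration file name. The helper `organism_to_filename` is hypothetical (it is not part of GenomeInfoDb); it only replaces spaces with underscores, capitalizes the first letter, and appends `.R`.

```shell
#!/bin/sh
# Hypothetical helper, not part of GenomeInfoDb: derive the registration
# file name from the full scientific name of an organism.
organism_to_filename() {
    printf '%s\n' "$1" | tr ' ' '_' |
        awk '{ print toupper(substr($0, 1, 1)) substr($0, 2) ".R" }'
}

organism_to_filename "Xenopus tropicalis"   # prints Xenopus_tropicalis.R
```

The resulting file would then go under GenomeInfoDb/inst/registered/NCBI_assemblies/.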

IMPORTANT NOTES TO OUTREACHY APPLICANTS:

Make sure to complete all the Preliminary tasks listed here before you start working on this task. In particular, make sure that you have R 4.2 and that you are set up to use the devel version of Bioconductor (currently 3.16).
Only one applicant can work on this task. If you choose to work on this task, please make sure to assign yourself so other applicants know that the task is already being worked on. If later on you change your mind, please unassign yourself. It's ok to change your mind!
To work on this task, please fork the GenomeInfoDb repository. Then do your work on that fork.
Always test your changes before you commit them to your fork. This consists in installing the modified package, starting R, loading the package, and playing around with the new functionality. This process is called "ad hoc manual testing". Once everything behaves and looks as expected, run R CMD build and R CMD check on the package. Note that R CMD check should always be run on the source tarball produced by R CMD build.
R CMD check might produce some NOTEs and even some WARNINGs. These are ok if they existed before your changes. You can check that by taking a look at the daily report produced by our automated builds here: https://bioconductor.org/checkResults/devel/bioc-LATEST/ Make sure to not introduce new NOTEs or WARNINGs!
Once your work is ready to be merged, please submit a PR (Pull Request).
Remember to record your contribution on Outreachy at https://www.outreachy.org/outreachy-december-2022-internship-round/communities/bioconductor/refactor-the-bsgenomeforge-tools/contributions/.
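
The build-then-check order described in the notes above can be sketched as follows. The helper names `tarball_name` and `build_and_check` are made up for this illustration. The sketch relies on R CMD build naming the source tarball after the Package and Version fields of the DESCRIPTION file, and it assumes each field sits on its own line.

```shell
#!/bin/sh
# Print "<Package>_<Version>.tar.gz" for the package source directory $1.
tarball_name() {
    pkg=$(sed -n 's/^Package:[[:space:]]*//p' "$1/DESCRIPTION")
    ver=$(sed -n 's/^Version:[[:space:]]*//p' "$1/DESCRIPTION")
    printf '%s_%s.tar.gz\n' "$pkg" "$ver"
}

# Build first, then check the tarball (not the source directory and not a
# single .R file). Defined here but not called: it needs R and the package.
build_and_check() {
    R CMD build "$1" && R CMD check "$(tarball_name "$1")"
}

# Demo on a throwaway DESCRIPTION file with illustrative values.
demo=$(mktemp -d)
printf 'Package: GenomeInfoDb\nVersion: 1.33.11\n' > "$demo/DESCRIPTION"
tarball_name "$demo"   # prints GenomeInfoDb_1.33.11.tar.gz
```

From the directory that contains the GenomeInfoDb source tree, `build_and_check GenomeInfoDb` would then run both steps in the right order.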

Simplecodez commented 2 years ago

Hi, @hpages please can you assign me to this task?

hpages commented 2 years ago

Done. Don't hesitate to ask if you have questions.

Simplecodez commented 2 years ago

Thank you sir, I will start working on it right away.

Simplecodez commented 2 years ago

Good day sir. Please, I don't understand exactly what I am supposed to do. I am a bit confused, sir.

hpages commented 2 years ago

Hi @Simplecodez ,

Can you try to formulate a more precise question? I'd be happy to provide as much clarification as needed but it'll be easier for me if I know a little bit more about what is not clear in the task description above, and whether you have tried things or not already.

Or perhaps you want to try to start with task #46 instead? This is the 1st task in your group of tasks (Frog). See the list of tasks here. The current issue is the 2nd task in the group. Note that the other two applicants have chosen to start with the 1st task in their respective group (Dog and Cat), which is to register a UCSC genome in the GenomeInfoDb package. The 2nd task in each group is to register an NCBI assembly in the GenomeInfoDb package. So by choosing the 1st task in the Frog group, you'll be working on a task that is very similar to tasks #43 (Dog) and #49 (Cat). Maybe the discussions there will help you get started with issue #46?

Let me know if you want to switch.

Simplecodez commented 2 years ago

Thank you sir, I have just forked and cloned the repo locally.

Simplecodez commented 2 years ago

Good day sir. I would really love to contribute to this project but I don't really know what to do. I have just forked and cloned the repo but don't know the files I am to edit or change sir.

hpages commented 2 years ago

What about my suggestion to switch to #46? I think it's going to be easier for a first task.

Simplecodez commented 2 years ago

> What about my suggestion to switch to #46? I think it's going to be easier for a first task.

Good day. I really appreciate your patience with me and your suggestion, but I have really done a lot of research on this one to back out now. I have created the Xenopus_tropicalis.R file and registered the organism, but when I run R CMD check Xenopus_tropicalis.R, I get this error: Error in getOctD(x, offset, len) : invalid octal digit. I don't know why, sir.

hpages commented 2 years ago

> ... but I have really done a lot of research on this one to back out now.

Hmm.. but you understand that all the research and work you've done so far won't be in vain because you'll resume your work on this issue after you're done with issue #46 right? Anyways, it's up to you.

> ... but when I run R CMD check Xenopus_tropicalis.R

This is not how we use R CMD check. Please read carefully my IMPORTANT NOTES TO OUTREACHY APPLICANTS at the top of this issue.

Also it's too early to try to run R CMD check. The R CMD build and R CMD check steps are usually the very last steps before a commit. Before we run them, we need to validate our changes via some "ad hoc manual testing" (as explained in my IMPORTANT NOTES TO OUTREACHY APPLICANTS above).

In this case, the ad hoc manual testing would consist of installing the modified GenomeInfoDb package, starting a fresh R session, loading GenomeInfoDb (with library(GenomeInfoDb)), and doing the following:

Check that registered_NCBI_assemblies("Xenopus tropicalis") works and returns the correct data.
Check that getChromInfoFromNCBI("UCB_Xtro_10.0") works and returns the correct data.
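
One possible way to make these two checks repeatable is to keep them in a small R script and rerun it after each reinstall. The sketch below only writes that script. The script location is an arbitrary choice, and the last check assumes, as described at the top of this issue, that a registered assembly no longer reports NAs in the circular column.

```shell
#!/bin/sh
# Write the two manual checks to a throwaway R script so they can be rerun
# after every reinstall. The script location is an arbitrary choice.
script="${TMPDIR:-/tmp}/manual_checks.R"
cat > "$script" <<'EOF'
library(GenomeInfoDb)

# The newly registered assembly should show up for this organism.
print(registered_NCBI_assemblies("Xenopus tropicalis"))

# Lookup by assembly name only works once the assembly is registered.
chrominfo <- getChromInfoFromNCBI("UCB_Xtro_10.0")
print(head(chrominfo))

# A registered assembly should report TRUE/FALSE, never NA, for circularity.
stopifnot(!anyNA(chrominfo$circular))
EOF
echo "Wrote $script"

# To run it (needs R, the edited GenomeInfoDb source tree, and network access):
#   R CMD INSTALL GenomeInfoDb && Rscript "$script"
```

The script still has to be run in a fresh R session after installing the modified package, and the printed data still needs to be inspected by eye.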

Simplecodez commented 2 years ago

Good day sir, I have successfully installed the edited GenomeInfoDb locally and tested registered_NCBI_assemblies("Xenopus tropicalis"), which returns the correct data. But when I try getChromInfoFromNCBI("UCB_Xtro_10.0"), I get an error saying: Error in function (type, msg, asError = True) : could not retrieve from host: ftp.ncbi.nlm.nih.gov

hpages commented 2 years ago

Hi @Simplecodez ,

> But when I try getChromInfoFromNCBI("UCB_Xtro_10.0"), I get an error saying: Error in function (type, msg, asError = True) : could not retrieve from host: ftp.ncbi.nlm.nih.gov

It seems that getChromInfoFromNCBI() was not able to access the NCBI FTP site to download the "Full sequence report" for UCB_Xtro_10.0. (See here for some explanation I provided in another issue about the "Full sequence report".)

This error could happen because the site was temporarily down or because your internet connection was temporarily down. Can you check your internet connection and try again? Also please provide the output of your sessionInfo().

Thanks, H.

Simplecodez commented 2 years ago

Okay, sir. I will try again later. Thank you.

Simplecodez commented 2 years ago

Thank you sir, I just ran getChromInfoFromNCBI("UCB_Xtro_10.0") and it outputs the correct data.

This is the output of sessionInfo():

function (package = NULL) 
{
    z <- list()
    z$R.version <- R.Version()
    z$platform <- z$R.version$platform
    if (nzchar(.Platform$r_arch)) 
        z$platform <- paste(z$platform, .Platform$r_arch, sep = "/")
    z$platform <- paste0(z$platform, " (", 8 * .Machine$sizeof.pointer, 
        "-bit)")
    z$locale <- Sys.getlocale()
    z$running <- osVersion
    z$RNGkind <- RNGkind()
    if (is.null(package)) {
        package <- grep("^package:", search(), value = TRUE)
        keep <- vapply(package, function(x) x == "package:base" || 
            !is.null(attr(as.environment(x), "path")), NA)
        package <- .rmpkg(package[keep])
    }
    pkgDesc <- lapply(package, packageDescription, encoding = NA)
    if (length(package) == 0) 
        stop("no valid packages were specified")
    basePkgs <- sapply(pkgDesc, function(x) !is.null(x$Priority) && 
        x$Priority == "base")
    z$basePkgs <- package[basePkgs]
    if (any(!basePkgs)) {
        z$otherPkgs <- pkgDesc[!basePkgs]
        names(z$otherPkgs) <- package[!basePkgs]
    }
    loadedOnly <- loadedNamespaces()
    loadedOnly <- loadedOnly[!(loadedOnly %in% package)]
    if (length(loadedOnly)) {
        names(loadedOnly) <- loadedOnly
        pkgDesc <- c(pkgDesc, lapply(loadedOnly, packageDescription))
        z$loadedOnly <- pkgDesc[loadedOnly]
    }
    z$matprod <- as.character(options("matprod"))
    es <- extSoftVersion()
    z$BLAS <- as.character(es["BLAS"])
    z$LAPACK <- La_library()
    l10n <- l10n_info()
    if (!is.null(l10n["system.codepage"])) 
        z$system.codepage <- as.character(l10n["system.codepage"])
    if (!is.null(l10n["codepage"])) 
        z$codepage <- as.character(l10n["codepage"])
    class(z) <- "sessionInfo"
    z
}
<bytecode: 0x000002188d649bf0>
<environment: namespace:utils>

So sir, can I run R CMD build and R CMD check now?

hpages commented 2 years ago

> Thank you sir, I just ran getChromInfoFromNCBI("UCB_Xtro_10.0") and it outputs the correct data.

Great. If you're confident that everything looks good, then please proceed with the R CMD build and R CMD check steps.

If that goes as expected, then commit your work and submit a PR. Don't forget to add the Xenopus_tropicalis.R file (with git add Xenopus_tropicalis.R) before you commit.

> This is the output of sessionInfo()

You're showing the body of the function, not the output produced by calling the function. I need the latter. Thanks!

Simplecodez commented 2 years ago

Good day sir. This is the output of my sessionInfo():

R version 4.2.1 (2022-06-23 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Ubuntu 20.04 x64

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] GenomeInfoDb_1.33.11 IRanges_2.31.2       S4Vectors_0.35.4    
[4] BiocGenerics_0.43.4 

loaded via a namespace (and not attached):
[1] compiler_4.2.1         GenomeInfoDbData_1.2.9 RCurl_1.98-1.9        
[4] bitops_1.0-7

hpages commented 2 years ago

Hi @Simplecodez ,

Thanks for providing your sessionInfo(). I see that you've installed R for Windows but that it's "running under Ubuntu 20.04 x64". This is a very unconventional setup. I didn't even know it was possible! Did you install an Ubuntu terminal environment on the Windows Subsystem for Linux (WSL), as documented here? I have no experience with the WSL so I hope that your setup will not be problematic.

For what it's worth, sessionInfo() usually reports something like this on an Ubuntu system:

> sessionInfo()
R version 4.2.0 Patched (2022-05-04 r82318)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

This is what I get on my machine.

I wish you had a more conventional Linux setup. As previously discussed with you in the #outreachy channel on the community-slack (on Oct 10), this is easy to achieve by installing Ubuntu alongside Windows.

Anyways, were you able to run the R CMD build and R CMD check steps successfully?

Thanks

Simplecodez commented 2 years ago

I got this error when I ran R CMD build

Error in loadvignetteBuilder(pkgdir, True) : vignette builder 'knitr' not found

Please, how do I fix this?

hpages commented 2 years ago

@Simplecodez Did you see my answer on the community-bioc Slack?

Simplecodez commented 2 years ago

Yes sir, I did. I am installing the knitr package now. Thank you.

Simplecodez commented 2 years ago

Good day sir. This is the result of R CMD check GenomeInfoDb_1.33.11.tar.gz:

* using log directory 'C:/Users/emma/Desktop/GenomeInfoDb.Rcheck'
* using R version 4.2.1 (2022-06-23 ucrt)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: ISO8859-1
* checking for file 'GenomeInfoDb/DESCRIPTION' ... OK
* this is package 'GenomeInfoDb' version '1.33.11'
* package encoding: UTF-8
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking whether package 'GenomeInfoDb' can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking dependencies in R code ... NOTE
Unexported object imported by a ':::' call: 'utils:::.roman2numeric'
  See the note in ?`:::` about the use of this operator.
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking files in 'vignettes' ... WARNING
Files in the 'vignettes' directory but no files in 'inst/doc':
  'Accept-organism-for-GenomeInfoDb.Rnw', 'GenomeInfoDb.Rnw'
* checking examples ... OK
* checking for unstated dependencies in 'tests' ... OK
* checking tests ... ERROR
  Running 'run_unitTests.R'
Running the tests in 'tests/run_unitTests.R' failed.
Last 13 lines of output:
  1 Test Suite : 
  GenomeInfoDb RUnit Tests - 21 test functions, 1 error, 0 failures
  ERROR in test_seqlevelsStyle_Seqinfo: Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
    line 1500 did not have 10 elements

  Test files with failing tests

     test_seqlevelsStyle.R 
       test_seqlevelsStyle_Seqinfo 

  Error in BiocGenerics:::testPackage("GenomeInfoDb") : 
    unit tests failed for package GenomeInfoDb
  Calls: <Anonymous> -> <Anonymous>
  Execution halted
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in 'inst/doc' ... WARNING
Directory 'inst/doc' does not exist.
Package vignettes without corresponding single PDF/HTML:
   'Accept-organism-for-GenomeInfoDb.Rnw'
   'GenomeInfoDb.Rnw'

* checking running R code from vignettes ... NONE
  'Accept-organism-for-GenomeInfoDb.Rnw' using 'UTF-8'... OK
  'GenomeInfoDb.Rnw' using 'UTF-8'... OK
* checking re-building of vignette outputs ... ERROR
Error(s) in re-building vignettes:
--- re-building 'Accept-organism-for-GenomeInfoDb.Rnw' using knitr
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, aperm,
    append, as.data.frame, basename, cbind, colnames, dirname,
    do.call, duplicated, eval, evalq, get, grep, grepl, intersect,
    is.unsorted, lapply, mapply, match, mget, order, paste, pmax,
    pmax.int, pmin, pmin.int, rank, rbind, rownames, sapply,
    setdiff, sort, table, tapply, union, unique, unsplit, which.max,
    which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    I, expand.grid, unname

Loading required package: IRanges

Attaching package: 'IRanges'

The following object is masked from 'package:grDevices':

    windows

Error: processing vignette 'Accept-organism-for-GenomeInfoDb.Rnw' failed with diagnostics:
pdflatex is not available
--- failed re-building 'Accept-organism-for-GenomeInfoDb.Rnw'

--- re-building 'GenomeInfoDb.Rnw' using knitr
Loading required package: GenomicFeatures
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Error: processing vignette 'GenomeInfoDb.Rnw' failed with diagnostics:
pdflatex is not available
--- failed re-building 'GenomeInfoDb.Rnw'

SUMMARY: processing the following files failed:
  'Accept-organism-for-GenomeInfoDb.Rnw' 'GenomeInfoDb.Rnw'

Error: Vignette re-building failed.
Execution halted

* checking PDF version of manual ... WARNING
LaTeX errors when creating PDF version.
This typically indicates Rd problems.
* checking PDF version of manual without index ... ERROR
Re-running with no redirection of stdout/stderr.
* DONE
Status: 3 ERRORs, 3 WARNINGs, 1 NOTE

Simplecodez commented 2 years ago

This is the result of R CMD build GenomeInfoDb:

* checking for file 'GenomeInfoDb/DESCRIPTION' ... OK
* preparing 'GenomeInfoDb':
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
--- re-building 'Accept-organism-for-GenomeInfoDb.Rnw' using knitr
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, aperm,
    append, as.data.frame, basename, cbind, colnames, dirname,
    do.call, duplicated, eval, evalq, get, grep, grepl, intersect,
    is.unsorted, lapply, mapply, match, mget, order, paste, pmax,
    pmax.int, pmin, pmin.int, rank, rbind, rownames, sapply,
    setdiff, sort, table, tapply, union, unique, unsplit, which.max,
    which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    I, expand.grid, unname

Loading required package: IRanges

Attaching package: 'IRanges'

The following object is masked from 'package:grDevices':

    windows

Error: processing vignette 'Accept-organism-for-GenomeInfoDb.Rnw' failed with diagnostics:
pdflatex is not available
--- failed re-building 'Accept-organism-for-GenomeInfoDb.Rnw'

--- re-building 'GenomeInfoDb.Rnw' using knitr
Loading required package: GenomicFeatures
Loading required package: GenomicRanges
Loading required package: AnnotationDbi
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Error: processing vignette 'GenomeInfoDb.Rnw' failed with diagnostics:
pdflatex is not available
--- failed re-building 'GenomeInfoDb.Rnw'

SUMMARY: processing the following files failed:
  'Accept-organism-for-GenomeInfoDb.Rnw' 'GenomeInfoDb.Rnw'

Error: Vignette re-building failed.
Execution halted

hpages commented 2 years ago

Hi @Simplecodez ,

It looks like you don't have the pdflatex command on your system. This command is part of TeX/LaTeX.

TeX/LaTeX is required to build the Sweave vignettes contained in the package. Vignettes are documents located in the vignettes/ folder of an R package. There are 2 types of vignettes: Sweave vignettes (extension .Rnw) and R Markdown vignettes (.Rmd extension). Only Sweave vignettes require TeX/LaTeX.

On Ubuntu, and other Debian-like systems, you can install TeX/LaTeX with:

sudo apt-get install texlive

Make sure to also install all the following additional Debian packages:

texlive-font-utils
texlive-pstricks
texlive-latex-extra
texlive-fonts-extra
texlive-bibtex-extra
texlive-science
texlive-luatex
texlive-lang-european
texi2html
texinfo
pandoc
pandoc-citeproc
biber

All these Debian packages can be installed with sudo apt-get install <package>
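
Before rerunning the build, a quick check like the one below can confirm that the key commands are now on the PATH. This is only a sketch: `missing_commands` is a hypothetical helper, and checking pdflatex, pandoc, and biber covers just a few of the packages listed above, not all of them.

```shell
#!/bin/sh
# Print each named command that is not found on PATH, one per line.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# pdflatex is the command that the vignette step reported as not available.
missing=$(missing_commands pdflatex pandoc biber)
if [ -n "$missing" ]; then
    echo "Not on PATH:" $missing
else
    echo "All checked commands found"
fi
```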

Then try R CMD build GenomeInfoDb again.

Let me know how that goes.

Simplecodez commented 2 years ago

Okay, sir. I will do that and get back to you. Thank you.

Simplecodez commented 2 years ago

Good day sir. I am sorry for not getting back to you sooner. I have installed texlive and the additional Debian packages. The result below is what I got after running R CMD build GenomeInfoDb. I also noticed that GenomeInfoDb_1.33.11.tar.gz has been created in the folder housing GenomeInfoDb.

* checking for file ‘GenomeInfoDb/DESCRIPTION’ ... OK
* preparing ‘GenomeInfoDb’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘GenomeInfoDb_1.33.11.tar.gz’

hpages commented 2 years ago

@Simplecodez It's great that you were able to install texlive plus all the additional packages! So now it seems that you can successfully run R CMD build GenomeInfoDb. That's really good progress.

One thing I notice is that your fork is still at version 1.33.11. However, the GenomeInfoDb repository that you forked from is now at version 1.33.15. Please sync your fork. That will bring it to version 1.33.15. Then run R CMD build GenomeInfoDb again. This time it should produce a source tarball with a name that reflects the latest version of the package (i.e. GenomeInfoDb_1.33.15.tar.gz).
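
As a small illustration of spotting a fork that lags behind, the two version strings can be compared directly. This sketch assumes a `sort` that supports the `-V` (version sort) option, as GNU sort does; `version_lt` is a made-up helper name.

```shell
#!/bin/sh
# Succeeds when version $1 sorts strictly before version $2.
# Assumes a sort with -V (version sort), e.g. GNU sort.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

if version_lt 1.33.11 1.33.15; then
    echo "fork is behind: sync it before building"
fi
```

A plain string comparison would not be reliable here, since for example 1.33.9 must sort before 1.33.11.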

Then you can delete the previous tarball and run R CMD check on the new one. If everything works fine, then commit, push, and create a PR (Pull Request). Don't forget to add Xenopus_tropicalis.R to git (with git add Xenopus_tropicalis.R) before you commit. Thanks!

Simplecodez commented 2 years ago

I have synced my fork with the latest version, and built and checked the files successfully. I have also created a pull request. Thank you sir.

Simplecodez commented 2 years ago

Can I start working on task #46?

Simplecodez commented 2 years ago

> I have synced my fork with the latest version, and built and checked the files successfully. I have also created a pull request. Thank you sir.

I noticed I made some mistakes in my first commit so I closed the first pull request and opened another one. I hope there is no problem with that?

hpages commented 2 years ago

> Can I start working on task #46?

Absolutely. I just assigned you to the task.

> I noticed I made some mistakes in my first commit so I closed the first pull request and opened another one. I hope there is no problem with that?

No problem at all. Thanks for the PR. I'm going to take a look at it.

Simplecodez commented 2 years ago

Okay, thank you

Simplecodez commented 2 years ago

Hi, @hpages. I just made the corrections pointed out in the PR I submitted and created another PR. I am looking forward to your reply, sir. Thank you.

Simplecodez commented 2 years ago

I also didn't understand what you meant by indentation with tab and spaces, so I just copied a previous registration and edited it accordingly. I hope everything works fine now. Thank you

hpages commented 2 years ago

Hi @Simplecodez , please do not create a new PR each time you make a correction to a PR. This is not necessary and it makes it difficult to follow. All you need to do is make the requested changes, commit them, and push them. The new commits will automatically be added to the current PR. The problem with closing and creating a new PR each time you make a change is that the new PR doesn't include the discussion that we started in the PR that you closed. We want the entire discussion about the PR to remain in one place.

> I also didn't understand what you meant by indentation with tab and spaces

What I meant is that the file contains tabs. No other registration file contains tabs:

hpages@spectre:~/github/Simplecodez/GenomeInfoDb/inst/registered/NCBI_assemblies$ grep -P "\t" *.R
Xenopus_tropicalis.R:        assembly_level="Chromosome",
Xenopus_tropicalis.R:        assembly_level="Chromosome",

They need to be replaced with spaces.
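
One way to do the replacement from the command line is sketched below, using the standard `expand` utility on a throwaway file that stands in for Xenopus_tropicalis.R. The tab width of 8 is an assumption; use whatever matches the indentation of the other registration files.

```shell
#!/bin/sh
# Throwaway file standing in for a registration file that contains a tab.
f=$(mktemp)
printf 'list(\n\tassembly_level="Chromosome",\n)\n' > "$f"

# A literal tab character, for use as a portable grep pattern.
tab=$(printf '\t')

# Count the lines that contain a tab.
grep -c "$tab" "$f"            # prints 1

# expand rewrites tabs as spaces; write to a temp file, then replace.
expand -t 8 "$f" > "$f.tmp" && mv "$f.tmp" "$f"

# No line with a tab should remain (grep -c exits non-zero on zero matches).
grep -c "$tab" "$f" || true    # prints 0
```

Configuring the text editor to insert spaces instead of tabs avoids the problem in the first place.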

Thanks!

Simplecodez commented 2 years ago

I have made the corrections, sir. Thank you.

hpages commented 2 years ago

Thanks for removing the tabs. More comments in PR #70. Please address so I can merge the PR and we can finally focus on task #46. Thanks again.

hpages commented 2 years ago

Hi @Simplecodez,

I just merged PR #70. :tada:

Congratulations on your first contribution to Bioconductor! Don't forget to record it on Outreachy at https://www.outreachy.org/outreachy-december-2022-internship-round/communities/bioconductor/refactor-the-bsgenomeforge-tools/contributions/.

Let's focus on #46 now. I'll go there and try to answer your questions.

Simplecodez commented 2 years ago

Thank you very much sir. I am really honoured to be a contributor. Thank you for your help and patience.

Simplecodez commented 2 years ago

> Hi @Simplecodez,
>
> I just merged PR #70. :tada:
>
> Congratulations on your first contribution to Bioconductor! Don't forget to record it on Outreachy at https://www.outreachy.org/outreachy-december-2022-internship-round/communities/bioconductor/refactor-the-bsgenomeforge-tools/contributions/.

Please sir, how do I get my contribution link? Is it this: https://github.com/Bioconductor/GenomeInfoDb/issues/47

> Let's focus on #46 now. I'll go there and try to answer your questions.

hpages commented 2 years ago

Yes, I guess you are supposed to use the link to the GitHub issue for the task that you accomplished.

Simplecodez commented 2 years ago

Okay, thank you.
