brouwern / compbio2021

Assignments for Computational Biology Fall 2021 at the University of Pittsburgh

Why does it need to be in a string set? #35

Open drh85 opened 3 years ago

drh85 commented 3 years ago

https://github.com/brouwern/compbio2021/blob/4fc22d26cd66119a177fd9572bd409ffa989216d/KEY-MSA-walkthrough-shroom.Rmd#L582

code by Nathan Brouwer, text by David Hall

brouwern commented 3 years ago

This is not something that will be tested.

msa() was written so that its input has to be a stringset. I haven't dug into it too much, but I think a stringset is just a way of organizing sequence data efficiently for other programs to use. Sequence data can get very large, and I think there's an underlying architecture beyond basic R data structures that makes it more memory efficient.

On my to-do list is to write a version of msa() that takes a list of FASTA files as its input and runs the stringset conversion internally within the function. This should be easy, and hopefully the authors of the package would integrate the change into the package.
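For anyone curious what a stringset looks like next to a plain R character vector, here is a minimal sketch. It assumes the Biostrings package from Bioconductor is installed; the three short sequences and their names are made up for illustration and are not from the walkthrough.

```r
library(Biostrings)

# Made-up protein fragments, for illustration only
seqs_chr <- c(seqA = "MKTLLVAG", seqB = "MKTLVAG", seqC = "MRTLLVG")

# Plain R: a named character vector
class(seqs_chr)

# Biostrings: the same sequences stored as an AAStringSet
seqs_set <- AAStringSet(seqs_chr)
class(seqs_set)

# Sequence names and lengths can be queried directly from the stringset
names(seqs_set)
width(seqs_set)

# Converting back to a plain character vector
as.character(seqs_set)
```

The stringset carries the same information as the character vector, but in a container that Bioconductor functions know how to work with.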
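A rough sketch of what such a wrapper could look like, assuming the Biostrings and msa packages are installed and the files hold protein sequences. The name `msa_from_fasta()` and its arguments are hypothetical and not part of the msa package. As I read the msa documentation, `inputSeqs` can also be a character vector or a single FASTA file name, so a wrapper like this would mainly help when the sequences are spread over several files.

```r
library(Biostrings)
library(msa)

# Hypothetical helper (not part of the msa package): read one or more
# protein FASTA files into a single AAStringSet, then align with msa().
msa_from_fasta <- function(fasta_files, method = "ClustalW", ...) {
  # Fail early if a path is wrong, instead of deep inside the reader
  stopifnot(is.character(fasta_files), all(file.exists(fasta_files)))

  # readAAStringSet() accepts a vector of file paths and returns
  # all of their sequences in one stringset
  seqs <- readAAStringSet(fasta_files, format = "fasta")

  # Extra arguments in ... are passed on to msa()
  msa(seqs, method = method, type = "protein", ...)
}
```

It would then be called with a character vector of file paths, for example `msa_from_fasta(c("seq1.fasta", "seq2.fasta"))`, where the file names are placeholders.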

In reply to drh85's message of Mon, Oct 4, 2021 at 11:01 AM.

--

Nathan L. Brouwer, PhD

Lecturer

Department of Biological Sciences https://www.biology.pitt.edu/

University of Pittsburgh

Biostatistics course: brouwern.github.io/BIOSC_1120/index.html

Research Associate

National Aviary, Dept. of Conservation & Field Research https://www.aviary.org/conservation

R code: github.com/brouwern

R tweets: @lobrowR https://twitter.com/lobrowR