Bioconductor / Biostrings

Efficient manipulation of biological strings
https://bioconductor.org/packages/Biostrings

confused about reading in a single long fasta subject sequence for pattern matching #89

Status: Closed (closed by tchrisboles 1 year ago)

tchrisboles commented 1 year ago

I want to search for crRNA matches along a single subject sequence. I thought I was required to use readDNAStringSet() to import sequences from FASTA files. However, when I run matchPDict() on the imported subject, I get an error saying I should have used vmatchPDict() (which is not ready). How should I import my single large subject file (~5 Mb)?

[image: screenshot of the terminal session showing the error]

hpages commented 1 year ago

Using a screenshot when you can simply copy-paste the content of your terminal into the issue is really not a good idea. It prevents others from finding this issue when they search issues by function name (e.g. matchPDict). It also prevents the people who are trying to help you from copy-pasting your code when they try to reproduce the problem on their own machine.

As for your problem: Please make sure to consult the man page for matchPDict(). There it clearly says that matchPDict() expects the subject to be a DNAString object. In your case, hap is a DNAStringSet object of length 1. DNAStringSet objects are list-like objects where the list elements are DNAString objects. To extract the first and sole list element from hap, just do hap[[1]]. This will return the first and sole list element in hap as a DNAString object.
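A minimal sketch of the extraction step described above, under stated assumptions: the temporary FASTA file, the record name `chr`, the toy subject sequence, and the two 8-mer patterns are all hypothetical stand-ins for the real ~5 Mb file and crRNA set. Only readDNAStringSet(), hap[[1]], and matchPDict() come from the discussion; PDict(), countPDict(), and writeXStringSet() are other Biostrings functions used to make the example self-contained.

```r
library(Biostrings)

# Hypothetical stand-in for the real subject file: a one-record FASTA
# written to a temporary path so the example runs on its own.
fasta <- tempfile(fileext = ".fasta")
writeXStringSet(DNAStringSet(c(chr = "ACGTACGTTTGACCAACGTACGTT")), fasta)

# readDNAStringSet() always returns a DNAStringSet, even for one record.
hap <- readDNAStringSet(fasta)

# Extract the sole element as a DNAString, which is what matchPDict() expects.
subject <- hap[[1]]

# Hypothetical patterns; a PDict built this way needs constant-width patterns.
patterns <- DNAStringSet(c("ACGTACGT", "TTGACCAA"))
pdict <- PDict(patterns)

# One set of match ranges per pattern, plus the per-pattern match counts.
hits <- matchPDict(pdict, subject)
counts <- countPDict(pdict, subject)
```

Note the double brackets: hap[1] would return another DNAStringSet of length 1, so the subject would still be rejected by matchPDict().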

Hope this helps, H.

hpages commented 1 year ago

Finally, note that questions about how to use the software are generally better asked on our support site (https://support.bioconductor.org). GitHub issues should preferably be used for reporting bugs and other issues with the software, or for requesting new features. Thanks!

H.

tchrisboles commented 1 year ago

Thanks for your reply.

Sorry for the bad etiquette.

Best regards,

Chris Boles, Ph.D., Chief Scientific Officer, Sage Science, Inc., 500 Cummings Center, Suite 2400, Beverly, MA 01915. Direct Office: +1-978-522-6284, Mobile: +1-781-856-2165, Company: +1-978-922-1832


tchrisboles commented 1 year ago

Thanks again. Again, sorry for the bad etiquette.
