Rob-murphys closed this issue 2 years ago
The -si file tells how many sequences each input FASTQ file contains for the DIAMOND search. These counts are used to balance sequencing-depth differences between samples. The -rs option can be left unset: by default, the minimum sequence number is used for random sampling. This is the recommended setting, and it is also the most common approach in metagenomic studies such as amplicon sequencing.
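To make the idea concrete, here is a minimal sketch of counting reads in a FASTQ file and randomly subsampling every sample to the minimum depth. It is not NCyc's own code: the function names `count_fastq_records` and `rarefy_hits`, the `hits` mapping, and the fixed seed are all assumptions for illustration, and it assumes uncompressed FASTQ with exactly four lines per record.

```python
import random
from collections import Counter


def count_fastq_records(lines):
    """Count records in a FASTQ stream, assuming 4 lines per record."""
    n_lines = sum(1 for _ in lines)
    if n_lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; not plain 4-line FASTQ?")
    return n_lines // 4


def rarefy_hits(hits, total_reads, depth, seed=0):
    """Randomly keep `depth` of `total_reads` reads and count gene hits among them.

    `hits` is a hypothetical mapping from read index (0-based) to the gene
    that read matched; reads without a match are simply absent from it.
    """
    if depth > total_reads:
        raise ValueError("sampling depth exceeds the number of reads")
    rng = random.Random(seed)
    kept = rng.sample(range(total_reads), depth)  # sampling without replacement
    return dict(Counter(hits[i] for i in kept if i in hits))


# Hypothetical per-sample read totals, as counted from each FASTQ file.
sample_totals = {"sampleA": 10, "sampleB": 6}
depth = min(sample_totals.values())  # default behaviour described above
```

With this layout, `rarefy_hits(hits_for_sampleA, sample_totals["sampleA"], depth)` would give gene counts for sampleA at the same depth as the smallest sample, so samples can be compared directly.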
From: @.> Sent: December 9, 2021, 18:42 To: @.> Cc: @.***> Subject: [qichao1984/NCyc] Is number of sequence just number of identified ORFs in the -si file? (Issue #24)
Pretty much as the title says. What is the number of sequences for the -si file?
Secondly, do you have any suggestions of best practices for the -rs options?
So the sequence number is just the number of protein sequences in my faa file?
Also, can I specify just one input for -d?