qichao1984 / NCyc


Is number of sequence just number of identified ORFs in the -si file? #24

Closed Rob-murphys closed 2 years ago

Rob-murphys commented 2 years ago

Pretty much as the title says. What is the number of sequences for the -si file?

Secondly, do you have any suggestions of best practices for the -rs options?

qichao1984 commented 2 years ago

The -si file gives the number of sequences in the input fastq file used for the diamond search. It is used to balance differences in sequencing depth between samples. The -rs option can be left unset: by default, the minimum sequence number across samples is used as the random sampling depth. This is the recommended setting and the most common approach in metagenomic and amplicon sequencing studies.
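As a rough illustration of the idea above (not NCyc's own code, which is written in Perl), the sketch below counts reads in an uncompressed FASTQ file and then randomly subsamples each sample to the smallest read total. The function names, the sample names, the gene counts, and the sampling-without-replacement approach are all assumptions made for the example.

```python
import random
from collections import Counter


def count_fastq_reads(path):
    """Count records in an uncompressed FASTQ file (4 lines per record)."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def rarefy(gene_hits, total_reads, depth, seed=0):
    """Subsample one sample down to `depth` reads, without replacement.

    gene_hits: dict mapping gene name -> number of reads that hit it.
    Reads without any hit fill the rest of the pool, so the draw is
    made from all `total_reads` reads, not only from the hits.
    """
    rng = random.Random(seed)
    pool = [gene for gene, n in gene_hits.items() for _ in range(n)]
    pool += [None] * (total_reads - len(pool))
    sampled = rng.sample(pool, depth)
    return Counter(gene for gene in sampled if gene is not None)


# Hypothetical data: total reads per sample and reads hitting each gene.
totals = {"sampleA": 1000, "sampleB": 400}
hits = {
    "sampleA": {"nifH": 50, "amoA": 20},
    "sampleB": {"nifH": 10, "amoA": 30},
}

# Default behaviour described above: use the minimum sequence number.
depth = min(totals.values())
normalized = {name: rarefy(hits[name], totals[name], depth) for name in totals}
```

Here `sampleB` already sits at the minimum depth, so its counts are unchanged, while `sampleA` is reduced to a random 400 of its 1000 reads. A value like the one returned by `count_fastq_reads` is what the totals in `totals` stand for in this sketch.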

From: @.> Sent: December 9, 2021, 18:42 To: @.> Cc: @.***> Subject: [qichao1984/NCyc] Is number of sequence just number of identified ORFs in the -si file? (Issue #24)


Rob-murphys commented 2 years ago

So the sequence number is just the number of protein sequences in my faa file?

Rob-murphys commented 2 years ago

Also, can I specify just one input for -d?