Open kokyriakidis opened 5 years ago
Tetrapoda will probably be the most informative for your purposes.
But to keep it simple, for running the assembly itself, just stick with the default Euk database. Once you have an assembly, I agree that Tetrapoda will be good!
I am using the latest Docker image. Running the first command runs the whole pipeline at once, as it says. Should I also run the annotation and evaluation commands, or are these already run by the first command?
Running the first command runs the entire pipeline, including TransRate and BUSCO (with the Euk database). After that, you can annotate or do whatever else you want. Does this make sense?
Yes, and thank you both very much for this work!
@macmanes Another question! Can I use several samples together? Or do I have to concatenate their _1 and _2 fastq files?
Concatenate them all together first, but remember the recommendation for including samples. In general, we strongly recommend that you assemble 1 individual per treatment or group.
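As a rough illustration of the concatenation step, here is a minimal Python sketch. It assumes uncompressed paired-end files named `<sample>_1.fastq` and `<sample>_2.fastq`; the function names and the sample prefixes are hypothetical and are not part of the pipeline. The important detail is that both mate files are built from the samples in the same order, so read pairs stay in sync.

```python
import shutil
from pathlib import Path


def concatenate_reads(inputs, output):
    """Append each input file to `output`, byte for byte, in the given order."""
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


def pool_samples(sample_prefixes, out_prefix):
    """Pool paired-end samples into one _1 file and one _2 file.

    Assumes each sample has <prefix>_1.fastq and <prefix>_2.fastq.
    Both mates use the same sample order so read pairs stay in sync.
    """
    for mate in ("1", "2"):
        inputs = [Path(f"{prefix}_{mate}.fastq") for prefix in sample_prefixes]
        concatenate_reads(inputs, Path(f"{out_prefix}_{mate}.fastq"))
```

For example, `pool_samples(["sampleA", "sampleB"], "pooled")` would write `pooled_1.fastq` and `pooled_2.fastq`, which can then be given to the pipeline as a single paired-end read set.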
@macmanes Could you please explain why that is? Wouldn't biological replicates help with assembling lower-expressed regions?
@macmanes I have 6 RNAseq libraries (~35M reads each): 3 are normal and 3 are not normal. Should I run the pipeline 3 times for the normal samples and fuse the assemblies with orthofuser, do the same for the other 3, and then fuse the 2 merged assemblies? I have read that going above 40M reads gives little to no improvement. Would using 2 samples, 1 normal and 1 not normal, help to recover transcripts better?
How about this: try one assembly using my recommendation (concatenate 2 individuals together, 1 normal and 1 not), and then do another experiment where you concatenate all the reads together. See what you get?
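To make the two suggested experiments concrete, here is a small Python sketch of the read depth each one would feed to the assembler. The library labels are hypothetical, and the per-library count is the approximate ~35M figure from the question, so the totals are estimates only.

```python
# Approximate reads per library, taken from the "~35M reads each" in the question.
READS_PER_LIBRARY = 35_000_000

# Hypothetical labels for the 6 libraries: 3 normal, 3 not normal.
normal = ["normal_1", "normal_2", "normal_3"]
not_normal = ["not_normal_1", "not_normal_2", "not_normal_3"]

# Experiment 1: one normal and one not-normal individual concatenated.
design_pair = [normal[0], not_normal[0]]

# Experiment 2: all reads from all 6 libraries concatenated.
design_all = normal + not_normal


def total_reads(libraries):
    """Estimated input depth, assuming every library has the same read count."""
    return len(libraries) * READS_PER_LIBRARY


for name, design in (("pair", design_pair), ("all", design_all)):
    print(f"{name}: {len(design)} libraries, ~{total_reads(design) / 1e6:.0f}M reads")
```

Under these assumptions the two-individual assembly gets roughly 70M reads and the all-sample assembly roughly 210M, so comparing the two results is a direct way to see whether the extra depth adds anything.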
Matt
@macmanes Thank you for your reply! These 6 samples are from 3 pairs of siblings. Do you think I should choose 1 normal sample and its not-normal sibling, or choose one from another family?
I think I’d try for choosing samples from within a family if possible, but not knowing how different families are, it’s hard to say.
Matt
Hello, do I have to choose which datasets to include, or could I use them all? I am running an analysis on Chelonia mydas.
The lineage is
Should I use the Tetrapoda dataset? Should I use Tetrapoda AND Eukaryota? Or should I use more?