Closed bjclavijo closed 11 years ago
Thanks for this, Bernardo! Do you have assembly stats for this (N50, total size, etc.) please? I am away from my desk at the moment. Was this based on 4 MiSeq runs?
best wishes,
Richard
On 14 May 2013, at 01:46, Bernardo Clavijo wrote:
This pull contains the Nornex preliminary first-pass draft assembly from TGAC for Tree35. Read files are to be uploaded to the FTP and the paths updated.
You can merge this Pull Request by running
git pull https://github.com/bjclavijo/data master
Or view, comment on, or merge it at:
https://github.com/ash-dieback-crowdsource/data/pull/5
Commit Summary
Added Nornex tree35 assembly by TGAC
File Changes
A ash_dieback/fraxinus_excelsior/tree35/assemblies/gDNA/Fraxinus_excelsior_Nornex_s1v1/Fraxinus_excelsior_Nornex_s1v1.tar.gz (0)
A ash_dieback/fraxinus_excelsior/tree35/assemblies/gDNA/Fraxinus_excelsior_Nornex_s1v1/assembly.info (22)
A ash_dieback/fraxinus_excelsior/tree35/reads/gDNA/read_set_1/read_set.info (21)
A ash_dieback/fraxinus_excelsior/tree35/reads/gDNA/read_set_2/read_set.info (21)
A ash_dieback/fraxinus_excelsior/tree35/reads/gDNA/read_set_3/read_set.info (21)
A ash_dieback/fraxinus_excelsior/tree35/reads/gDNA/read_set_4/read_set.info (21)
A ash_dieback/fraxinus_excelsior/tree35/strain.info (34)
Patch Links:
https://github.com/ash-dieback-crowdsource/data/pull/5.patch
https://github.com/ash-dieback-crowdsource/data/pull/5.diff
Dr Richard Buggs | Senior Lecturer | School of Biological and Chemical Sciences, Queen Mary University of London, E1 4NS, United Kingdom | email: r.buggs@qmul.ac.uk | website: http://www.sbcs.qmul.ac.uk/staff/richardbuggs.html | office: +44(0)207 882 3058 | mobile: +44(0)772 992 0401 | twitter: @RJABuggs
Yes it is based on the 4 runs available on the FTP, 2 are MiSeq and 2 are HiSeq.
Stats as from abyss-fac are:
I am aware of duplication/incorrect copy numbers in the assembly due to the heterozygosity, but importantly I think most of the unique content is assembled to a relatively good standard. That means that if a gene IS in the genome, it will be assembled, but it might appear more times than it should.
This is a starting point. I will obviously keep working on this, but this really was a first-pass assembly that I do as part of the data analysis on the runs more than anything. Little to no tweaking. We are releasing it because we think it's good enough for a lot of analysis, and our computing platform allowed us to do it quickly while some other people might not have the resources.
If you (or anyone else) have specific concerns or questions, please write to me at my TGAC email and we can work it out together.
Cheers,
bj
On 15 May 2013, at 12:09, RichardBuggs wrote:
Thanks for this, Bernardo! Do you have assembly stats for this (N50, total size, etc.) please? I am away from my desk at the moment. Was this based on 4 MiSeq runs?
best wishes,
Richard
Hi Bernardo,
We feel pretty cautious about our 454 assembly so far, but like you we felt it was worth releasing ASAP.
I think perhaps you forgot to paste the stats into your email (see below)...
Have you run the assembly through the CEGMA pipeline to see how many core eukaryotic genes you hit?
best wishes
Richard
On 15 May 2013, at 13:27, Bernardo Clavijo wrote:
Yes it is based on the 4 runs available on the FTP, 2 are MiSeq and 2 are HiSeq.
Stats as from abyss-fac are:
I am aware of duplication/incorrect copy numbers in the assembly due to the heterozygosity, but importantly I think most of the unique content is assembled to a relatively good standard. That means that if a gene IS in the genome, it will be assembled, but it might appear more times than it should.
This is a starting point. I will obviously keep working on this, but this really was a first-pass assembly that I do as part of the data analysis on the runs more than anything. Little to no tweaking. We are releasing it because we think it's good enough for a lot of analysis, and our computing platform allowed us to do it quickly while some other people might not have the resources.
If you (or anyone else) have specific concerns or questions, please write to me at my TGAC email and we can work it out together.
Cheers,
bj
Hi Richard, sorry, the stats were attached as an image. I'm pasting them as text here; I hope the fonts don't mess up the display:
409503 387267 62063 200 2801 5911 11876 123139 1.282e9 Fraxinus_excelsior_Nornex_s1v1-contigs.fa
249580 249580 48860 392 3526 7138 15616 315237 1.285e9 Fraxinus_excelsior_Nornex_s1v1-scaffolds.fa
Haven't had time to put it through CEGMA yet; releasing soon was the priority. But I'll do it (or somebody else will... this is crowdsourcing after all :P ). Having said that, I think on gene presence we'll be OK, but there will be n-plications (typical heterozygous assembly results, and as you see the total bp is higher than the genome size, so this is expected).
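For anyone who wants to work with the two rows above programmatically, here is a minimal Python sketch. The column names are an assumption: the rows were pasted without a header, and the names below follow the header printed by older abyss-fac releases (with a 200 bp length threshold), so check them against your own abyss-fac output. The `n50` helper is a hypothetical illustration of how the N50 figure is usually defined, not the code abyss-fac itself runs.

```python
# Column names are an ASSUMPTION: the rows in the comment were pasted without
# a header; these follow the header of older abyss-fac releases.
ASSUMED_COLUMNS = ["n", "n:200", "n:N50", "min", "N80", "N50", "N20", "max", "sum", "name"]

# The two rows exactly as pasted in the comment above.
ROWS = """\
409503 387267 62063 200 2801 5911 11876 123139 1.282e9 Fraxinus_excelsior_Nornex_s1v1-contigs.fa
249580 249580 48860 392 3526 7138 15616 315237 1.285e9 Fraxinus_excelsior_Nornex_s1v1-scaffolds.fa
"""


def parse_fac_row(line):
    """Turn one whitespace-separated stats row into a dict keyed by column name."""
    record = {}
    for key, value in zip(ASSUMED_COLUMNS, line.split()):
        if key == "name":
            record[key] = value
        else:
            # "sum" is written in scientific notation, so parse via float first.
            record[key] = int(float(value))
    return record


def n50(lengths):
    """Largest length L such that sequences of length >= L cover at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequences given")


stats = [parse_fac_row(line) for line in ROWS.splitlines() if line.strip()]
for s in stats:
    print(f"{s['name']}: N50={s['N50']} bp, max={s['max']} bp, total={s['sum']:.3e} bp")

# Toy lengths only, to show the definition of N50; not data from this assembly.
toy_n50 = n50([2, 3, 4, 5, 6])
print("toy N50:", toy_n50)
```

Under the assumed column names, the scaffolds row has a higher N50 and a larger maximum length than the contigs row, while the total size is almost unchanged.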
Cheers,
bj
On 15 May 2013, at 15:25, RichardBuggs wrote:
Hi Bernardo,
We feel pretty cautious about our 454 assembly so far, but like you we felt it was worth releasing ASAP.
I think perhaps you forgot to paste the stats into your email (see below)...
Have you run the assembly through the CEGMA pipeline to see how many core eukaryotic genes you hit?
best wishes
Richard
Hi Bernardo, I'm going to put your stats on the wiki page for this entry. Just so you know, you can use Markdown to format comments nicely on GitHub.
I have added CEGMA analysis results to the wiki. These are a bit better than the 454 assembly at ashgenome.org.
Richard
On 16 May 2013, at 13:33, Dan MacLean wrote:
Hi Bernardo, I'm going to put your stats on the wiki page for this entry. Just so you know, you can use Markdown to format comments nicely on GitHub.