Closed manulera closed 1 month ago
Hi all,
I have a quick question on the pydna Dseqrecord page: is there no built-in method to remove a feature from, say, a .gb file? Is a list comprehension the best way of going about it?
Thanks!
Hi there @JeffXiePL, a list comprehension with an if clause is probably the best way to filter a list. There are similar ways, but they are no better:
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.SeqFeature import SeqFeature, SimpleLocation
# We create a SeqRecord with two features
f1 = SeqFeature(SimpleLocation(1, 5), type='CDS', id='f1')
f2 = SeqFeature(SimpleLocation(8, 15), type='misc_feature', id='f2')
# Wrap the sequence in a Seq object, which is what SeqRecord expects
seqr = SeqRecord(Seq('AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'), features=[f1, f2])
# We filter out a feature
seqr.features = [f for f in seqr.features if f.type != 'misc_feature']
print(seqr.features)
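If you need this in several places, the same idea can be wrapped in a small helper. This is only a sketch: `remove_features` is a hypothetical name, not part of pydna or Biopython, and the records below are plain `SimpleNamespace` stand-ins so the snippet runs without either library installed. It assumes only that the record exposes `.features` as a list, which is the case for SeqRecord and Dseqrecord.

```python
from types import SimpleNamespace


def remove_features(record, should_remove):
    """Drop every feature for which should_remove(feature) is true.

    Hypothetical helper (not a pydna or Biopython function). It modifies
    record.features in place and returns the number of features dropped.
    """
    before = len(record.features)
    record.features = [f for f in record.features if not should_remove(f)]
    return before - len(record.features)


# Stand-in objects that mimic the two attributes used here (type, id).
# With real data you would pass a SeqRecord or Dseqrecord instead.
record = SimpleNamespace(
    features=[
        SimpleNamespace(type="CDS", id="f1"),
        SimpleNamespace(type="misc_feature", id="f2"),
    ]
)

removed = remove_features(record, lambda f: f.type == "misc_feature")
print(removed, [f.id for f in record.features])
```

Passing the condition as a function keeps the helper generic: the same call can filter by type, by id, or by location without changing the helper itself.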
Hello all,
I wanted to ask why there are two Contig objects after using assembly_circ on an Assembly object. I couldn't find much detail in the documentation.
Peilun Xie
Hi @JeffXiePL, I think you probably mean assemble_circular.
A Contig is a subclass of Dseqrecord with some extra methods that allow you to see how it was assembled. When you call assemble_circular, in principle you get all possible circular assemblies that can be produced given the algorithm that you passed, returned as Contigs. A set of fragments may be assembled in different ways. If you share an example you don't understand, I can explain a bit better.
Note however that the current implementation sometimes gives unexpected results, given how the possible assemblies are computed. This will be fixed once I merge the new implementation.
In the example below, where the homology region of a Gibson assembly, ACGTAATG, appears in several fragments, assemble_circular returns 4 contigs, each representing a fragment circularised, in forward and reverse orientation. All this to say that if you are getting results that you think don't make sense, it may be because of that. In any case, feel free to share an example.
from pydna.assembly import Assembly
from pydna.dseqrecord import Dseqrecord
a = Dseqrecord("ACGTAATGaccACGTAATG")
b = Dseqrecord("ACGTAATGcgcACGTAATG")
assembly = Assembly((a, b), limit=8)
for out in assembly.assemble_circular():
    print(out.seq)
More info on what gives this behaviour (no need to go into it, but putting it here for documentation purposes):
https://github.com/BjornFJohansson/pydna/issues/166
https://github.com/BjornFJohansson/pydna/issues/200
https://github.com/BjornFJohansson/pydna/issues/192
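If the forward and reverse copies get in the way, one possible workaround is to collapse circular sequences that are the same molecule up to rotation and strand. The sketch below is plain Python and does not use pydna. `canonical_circular` and `unique_circular` are hypothetical helper names, and the input strings are illustrative stand-ins for `str(contig.seq)`, not actual pydna output. It assumes unambiguous A/C/G/T sequences and ignores letter case.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string (A/C/G/T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def canonical_circular(seq: str) -> str:
    """Smallest rotation of the sequence, taken over both strands.

    Two circular molecules that differ only by starting position or by
    orientation map to the same canonical string.
    """
    s = seq.upper()
    candidates = []
    for strand in (s, reverse_complement(s)):
        candidates.extend(strand[i:] + strand[:i] for i in range(len(strand)))
    return min(candidates)


def unique_circular(seqs):
    """Keep the first sequence seen for each distinct circular molecule."""
    seen = {}
    for s in seqs:
        seen.setdefault(canonical_circular(s), s)
    return list(seen.values())


# Illustrative inputs: one circle, a rotation of it, its reverse
# complement, and a second, different circle.
fwd = "ACGTAATGacc"
rotated = fwd[3:] + fwd[:3]
other = "ACGTAATGcgc"
print(unique_circular([fwd, rotated, reverse_complement(fwd), other]))
```

Building every rotation is quadratic in sequence length, which is fine for a handful of plasmid-sized contigs but would need a smarter canonical form for large inputs.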
cc @BjornFJohansson @hiyama341 @dgruano.
@JeffXiePL is going to work on the pydna documentation over the next few weeks, and I made a list of what I think should be covered. The idea is to write it in the style of a cookbook (how to achieve a task) rather than library documentation (what every class method does, etc.). I know there is a bit of that in the cookbook folder, but we would like to cover a bit more.
Below is the link to the guidelines for the documentation. Feel free to edit or add things, within reason, for @JeffXiePL to cover.
https://docs.google.com/document/d/19sRRAMIHqn0rg-oHSdqIR6DxTIHYo2uj15nRdjq8D5Q/edit?usp=drive_link