Phelimb / atlas

Change threshold for outputting paths in `walk` #21

Open Phelimb opened 8 years ago

Phelimb commented 8 years ago
        if v["len_dna"] == N + 1 and v["dna"][-1] == "*":
            keep_paths[k] = v

is too strict a criterion. It will not output any paths with a single-base deletion (frameshift).
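One possible way to relax the check, sketched under assumptions: the function name `filter_paths` and the `max_len_diff` tolerance are hypothetical and are not part of atlas. The sketch only assumes the `len_dna` and `dna` keys and the `N + 1` target length shown in the snippet above. Whether the terminal `"*"` check should also be relaxed for frameshifted paths is left open here.

```python
def filter_paths(paths, N, max_len_diff=0):
    """Keep paths whose length is within max_len_diff of N + 1 and that end in "*".

    With max_len_diff=0 this matches the strict check quoted above;
    max_len_diff=1 would also admit a single-base insertion or deletion.
    """
    keep_paths = {}
    for k, v in paths.items():
        # Hypothetical tolerance: allow the length to differ from N + 1.
        length_ok = abs(v["len_dna"] - (N + 1)) <= max_len_diff
        # Same terminal check as the original (endswith also handles empty strings).
        if length_ok and v["dna"].endswith("*"):
            keep_paths[k] = v
    return keep_paths


# Made-up toy paths, for illustration only.
paths = {
    "exact": {"dna": "ATGAAA*", "len_dna": 7},
    "one_base_deleted": {"dna": "ATGAA*", "len_dna": 6},
    "no_terminator": {"dna": "ATGAAAT", "len_dna": 7},
}

strict = filter_paths(paths, N=6)                      # original behaviour
relaxed = filter_paths(paths, N=6, max_len_diff=1)     # tolerates one base of drift
```

In this toy input the strict call keeps only `exact`, while the relaxed call also keeps `one_base_deleted`. A larger tolerance would be needed for the 10 bp and 20 bp deletion experiments, at the cost of admitting more spurious paths.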

Phelimb commented 8 years ago

Inserting a stop codon often returns nothing. When `walk` does output a path, the translation is broken (as expected).

@ronald-jaepel could you point me to an example where this works and one where it doesn't (or several of each)?

Phelimb commented 8 years ago

Introducing a 10 bp deletion rarely results in a full assembly of a gene in the database. @ronald-jaepel could you also point me to an example of this?

ronald-jaepel commented 8 years ago

Data is in /data1/projects/ronald_jaepel/atlas_test/Simulation_Products/simulated_reads/$TIMESTAMP/ and /data1/projects/ronald_jaepel/atlas_test/Simulation_Products/generated_genomes/$TIMESTAMP/

The JSON output from atlas is in /data1/projects/ronald_jaepel/atlas_test/JSONS/$TIMESTAMP/

for the following experiments:

        TIMESTAMP="2016_08_22_1446/" #first deletion of 10 with all families still included
        TIMESTAMP="2016_08_23_1156/" #last deletion of 20 without families cml - sul - aac - aad - B
        TIMESTAMP="2016_08_23_1500/" #insertion of stop-codon

In each case, ecoli_all_families was the initial simulation, with one allele of each family and 1 000 000 simulated reads. All others are subsamples of those reads (10 000 - 840 000).