VGP / vgp-assembly

VGP repository for the genome assembly working group

Update Bionano apps #35

Closed msimbirsky closed 5 years ago

msimbirsky commented 5 years ago

Add a command to the Bionano apps that removes the colon character added to contig names, and refactor the apps to follow current coding best practices.
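As a rough illustration of the colon removal, here is a minimal sketch. It is not the command that ships in the apps (that command is not quoted in this thread); `strip_header_colons` is a hypothetical helper, and it assumes the colons should simply be deleted, and only on FASTA header lines.

```shell
# Hypothetical helper: delete every ':' on FASTA header lines (those starting with '>').
# Sequence lines pass through unchanged.
strip_header_colons() {
    sed '/^>/ s/://g'
}

# Example with a made-up contig name:
printf '>contig_1:subseq_1_100\nACGT\n' | strip_header_colons
```

Restricting the substitution to lines that start with `>` keeps the sequence data untouched even if a stray colon shows up there.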

Arkarachai commented 5 years ago

Could you point me to the project where you ran the test? Also, the Swiss Army Knife will correct only the leading and trailing Ns of each scaffold, but not the enzyme recognition sites that are still mixed up in the Bionano NCBI output, right?
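For illustration, trimming only the flanking Ns could look roughly like the sketch below. It is not the Swiss Army Knife command used in the apps (that command is not shown in this thread); `trim_flanking_n` is a hypothetical helper, and it writes each sequence on a single unwrapped line. Internal N runs, such as those at enzyme recognition sites, are left as they are.

```shell
# Hypothetical helper: remove leading and trailing N/n from each FASTA record.
# Multi-line sequences are joined first so that only the true ends are trimmed.
trim_flanking_n() {
    awk '
        function emit() {
            if (name != "") {
                sub(/^[Nn]+/, "", seq)   # leading Ns
                sub(/[Nn]+$/, "", seq)   # trailing Ns
                print name
                print seq
            }
        }
        /^>/ { emit(); name = $0; seq = ""; next }
        { seq = seq $0 }
        END { emit() }
    '
}

# Example with made-up data: the internal N survives, the flanking Ns do not.
printf '>scaffold_1\nNNAC\nNGTNN\n' | trim_flanking_n
```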

msimbirsky commented 5 years ago

That's right. The project is here: https://platform.dnanexus.com/projects/FVkzY180G2V1pxZYPKFJKxyx/monitor/job/FVzGpz80G2V4zPP81bxJpp2G

Arkarachai commented 5 years ago

This looks good, @msimbirsky. I just didn't have a chance to validate your sed command because your input doesn't contain any such case. Also, I noticed that the unscaffolded output from Bionano has 'NNNN' in it, which surprised me. Maybe it came from scaff10x. I don't have access to your original input fasta (file-FVj19Xj0b7ZKGQpBBXy2K251), so I can't tell what is going on. If you didn't use the 10x scaffolds as input, maybe we should ping Arang and the rest to investigate.

msimbirsky commented 5 years ago

@Arkarachai I do not see any Ns in the input fasta file, so they must have been added by Bionano, possibly at the restriction enzyme sites.
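One way to double-check a fasta for Ns is to count them outside the header lines. This is only a sketch: `count_n_bases` is a hypothetical helper, and `input.fasta.gz` in the comment is a placeholder file name rather than one of the files in the project.

```shell
# Hypothetical helper: count N/n characters in sequence lines, ignoring headers.
count_n_bases() {
    awk '!/^>/ { total += gsub(/[Nn]/, "") } END { print total + 0 }'
}

# Example with made-up data; for a gzipped file, something like
#   zcat input.fasta.gz | count_n_bases
printf '>seq_1\nACNNGT\n>seq_2\nACGT\n' | count_n_bases
```

A result of 0 on the input and a non-zero result on the Bionano output would support the idea that the Ns are introduced during the Bionano step.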

Arkarachai commented 5 years ago

The NNN is present in the first sequence:

dx cat project-FVkzY180G2V1pxZYPKFJKxyx:file-FVzKXk80j2zYYjvq4160x20y | zcat | less

Scaff10x_0_subseq_13154162_13245134_obj

I see at least two of them in that sequence. If the NNN is added by Bionano, I'm a bit surprised that they call this unscaffolded. Could you grant me access to the input files?

file-FVj19Xj0b7ZKGQpBBXy2K251
file-FQ03XKQ0b7ZJBxKb4Bgk9qqj

msimbirsky commented 5 years ago

The inputs are in the bCatUst1 project, which is shared with org-vgp. Are you part of this org? If not, you should ask Arang to add you.

msimbirsky commented 5 years ago

I will go ahead and publish the new app versions, since the unscaffolded issue is unrelated.