nservant closed this issue 3 years ago
Hi Nicolas,
I have now tried to change the behaviour to print out all chromosomes, even if they were not covered by SNPs. Could you give it a whirl and see if it appears to do what you wanted? Addressed here: 9a81c16576a88a7d0e83b4c19a9d3e6b3d9ed4c9
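To illustrate the behaviour described above, here is a minimal Python sketch. It is not SNPsplit's actual Perl implementation: the function name `n_mask_genome` and the in-memory dict layout are assumptions made purely for this example. The point is that a chromosome with no SNP entries is passed through unchanged rather than dropped.

```python
def n_mask_genome(genome, snps):
    """Return an N-masked copy of `genome`.

    genome: dict mapping chromosome name -> sequence string
    snps:   dict mapping chromosome name -> iterable of 1-based SNP positions
    (both layouts are hypothetical, chosen only for this sketch)
    """
    masked = {}
    for chrom, seq in genome.items():
        bases = list(seq)
        # Chromosomes absent from `snps` get an empty position list,
        # so they are still emitted, just without any masking.
        for pos in snps.get(chrom, ()):
            if 1 <= pos <= len(bases):
                bases[pos - 1] = "N"
        masked[chrom] = "".join(bases)
    return masked


# Toy data: the second sequence has no SNPs at all.
genome = {"chr1": "ACGTACGT", "chrUn_patch": "GGCC"}
snps = {"chr1": [2, 5]}
masked = n_mask_genome(genome, snps)
```

In this toy case `masked` still contains `chrUn_patch`, with its sequence untouched.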
Hi @FelixKrueger, I ran the new version.
SNPsplit_genome_preparation --strain CAST_EiJ --reference_genome genome --vcf_file mgp.v5.merged.snps_all.dbSNP142.vcf
Two things:
Using the following chromosomes (...)
Is this expected? Otherwise, I do have all chromosomes as expected in the results folder. Cheers
Hmm, the N-masking seems to work fine if you specify --full_sequence as well. I'll take another look tomorrow.
Right, it was ... - a scoping issue. It should work now. Could you clone the dev version and try again? Addressed here: 0e4431e98645058c69a8503ddb0ce324b26b5b00.
Yes. Much better now!
Summary 20668547 Ns were newly introduced into the N-masked genome for strain CAST_EiJ in total
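The summary line reports Ns that were "newly introduced", which suggests that positions already N in the reference are not counted. A rough Python sketch of such a count follows. The helper name `count_new_ns` is hypothetical, and this is an assumption about what the number means, not a description of how SNPsplit computes it.

```python
def count_new_ns(original, masked):
    """Count positions that are N in `masked` but were not N in `original`."""
    if len(original) != len(masked):
        raise ValueError("sequences must have the same length")
    return sum(1 for o, m in zip(original, masked) if m == "N" and o != "N")


# Position 5 is already N in the original, so only positions 2 and 6 count.
new_ns = count_new_ns("ACGTNACGT", "ANGTNNCGT")
```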
Awesome, I'll leave this open for a few more days to give you some time to test. It will then find its way into the next release.
Hi, when I run the SNPsplit_genome_preparation script on the complete mouse genome (base chromosomes + all scaffolds/fixes) with --no_nmasking, the full_sequence output contains only the base chromosomes.
My genome reference comes from:
Command line:
Output:
I think it would be good to export all chromosomes, even if they have no SNPs. From Ensembl:
Fix patches: provide improved sequence for known assembly errors. These patches will be incorporated into the primary assembly in the next major assembly release. They are coloured green in the Chromosome summary page and Region in detail page. They are improvements on the primary assembly and should be used preferentially over the primary assembly.
Thanks @FelixKrueger! Nicolas
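For anyone wanting to verify the report above on their own data, one way to spot dropped sequences is to compare FASTA header names between the reference and the generated output. The Python sketch below works on lists of lines so that it stays self-contained. The helper names and the sequence name `PATCH_EXAMPLE` are made up for illustration, and it assumes you have already collected the header lines from the output files.

```python
def fasta_names(lines):
    """Collect sequence names (first word after '>') from FASTA lines."""
    names = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def missing_sequences(reference_lines, output_lines):
    """Return reference sequence names that do not appear in the output."""
    produced = set(fasta_names(output_lines))
    return [name for name in fasta_names(reference_lines) if name not in produced]


# Toy input: the patch sequence is present in the reference but not the output.
reference = [">1 dna:chromosome", "ACGT", ">PATCH_EXAMPLE dna:chromosome", "GGCC"]
output = [">1", "ACGT"]
missing = missing_sequences(reference, output)
```

An empty `missing` list would indicate that every reference sequence was written out.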