Closed (gwct closed this issue 8 months ago)
NEAT doesn't have a flag for this (though this is a feature we could potentially add), but you can do it in one of two ways. The first is to generate a custom mutation model from an existing VCF (just filter out the indels first, so it only sees SNPs). This will result in NEAT setting the indel percentage to 0. The second is to use the pre-built model "models/MutModel_NA12878_noIndel.pickle.gz", which should produce a no-indel result, but it is geared towards human data, so it may not fit your use case.
-Josh
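For the first option, the indel-filtering step could look something like the following. This is only a minimal sketch, not part of NEAT: the function names `is_snp_record` and `filter_snps` and the file names in the usage comment are hypothetical, and it assumes an uncompressed, tab-delimited VCF. It keeps header lines plus any record whose REF and ALT alleles are all single bases.

```python
def is_snp_record(line: str) -> bool:
    """Return True if a VCF data line has a single-base REF and only single-base ALT alleles."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return False  # malformed record: fewer than the CHROM..ALT columns
    ref = fields[3]
    alts = fields[4].split(",")
    bases = set("ACGTN")
    return (
        len(ref) == 1
        and ref.upper() in bases
        and all(len(alt) == 1 and alt.upper() in bases for alt in alts)
    )


def filter_snps(in_handle, out_handle) -> int:
    """Copy header lines and SNP-only records; return the number of records kept."""
    kept = 0
    for line in in_handle:
        if line.startswith("#"):
            out_handle.write(line)  # keep all header/meta lines unchanged
        elif is_snp_record(line):
            out_handle.write(line)
            kept += 1
    return kept


# Hypothetical usage (file names are placeholders):
# with open("my_variants.vcf") as src, open("my_variants.snps.vcf", "w") as dst:
#     print(filter_snps(src, dst), "SNP records kept")
```

Note that this drops a multi-allelic record entirely if any one of its ALT alleles is not a single base. If bcftools is installed, `bcftools view -v snps` is an established alternative for the same job.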
From: Gregg Thomas Sent: Monday, October 2, 2023 3:00 PM To: ncsa/NEAT Subject: [ncsa/NEAT] Easy way to specify not to simulate indels in the mutation model? (Issue #85)
Hello, I'm wondering if there is some way that I'm not seeing to tell NEAT to only insert SNPs and not indels while it is simulating reads? I'm using v3.2.
Thanks!
Ok, thanks. I think the human-based model should be fine -- I just need something without indels. I do agree that a --noindel flag would be really convenient.
Otherwise, are there examples of model files somewhere such that I could create my own if I wanted? I see the pickled ones, but I don't know of an easy way to view those.
Thanks again!
In the Utilities folder, there is a utility called gen_mut_model.py. It reads in your real VCF data (you'll need the reference as well) and generates a mutation model pickle file that can be used as input to NEAT with the -m flag (lower case). The pre-built no-indel model can be passed to NEAT the same way.
Example:
python gen_mut_model.py -m my_variants.vcf -r my_ref.fa -o models/no-indel
python gen_reads.py -m models/no-indel.pickle -r my_ref.fa -c 10
Pickle is just a way to store a Python object in a file, so the files themselves hold either lists of numbers or a class instance, and are basically indecipherable without knowing the details of the code.
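For anyone who wants to peek inside one of the pickled models anyway, something along these lines may help. It is a rough sketch using only the Python standard library; the helper names `load_pickle` and `describe` are hypothetical, and it assumes nothing about how NEAT actually lays out its models. It loads a plain or gzip-compressed pickle and prints the types and sizes of the top levels of whatever object comes back. Only unpickle files you trust, since unpickling can execute arbitrary code, and if a model stores class instances, the modules defining those classes must be importable.

```python
import gzip
import pickle


def load_pickle(path):
    """Load a pickle file, handling gzip compression by checking the magic bytes."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    opener = gzip.open if magic == b"\x1f\x8b" else open
    with opener(path, "rb") as fh:
        return pickle.load(fh)


def describe(obj, label="", depth=0, max_depth=2, max_items=5):
    """Return indented lines summarizing the type and size of a nested object."""
    pad = "  " * depth
    size = f" (len {len(obj)})" if hasattr(obj, "__len__") else ""
    lines = [f"{pad}{label}{type(obj).__name__}{size}"]
    if depth >= max_depth:
        return lines  # do not expand below the requested depth
    if isinstance(obj, dict):
        for key in list(obj)[:max_items]:
            lines.extend(describe(obj[key], f"{key!r}: ", depth + 1, max_depth, max_items))
    elif isinstance(obj, (list, tuple)):
        for item in obj[:max_items]:
            lines.extend(describe(item, "", depth + 1, max_depth, max_items))
    elif hasattr(obj, "__dict__"):
        for name, value in list(vars(obj).items())[:max_items]:
            lines.extend(describe(value, f".{name}: ", depth + 1, max_depth, max_items))
    return lines


# Hypothetical usage with the pre-built model mentioned earlier in the thread:
# model = load_pickle("models/MutModel_NA12878_noIndel.pickle.gz")
# print("\n".join(describe(model)))
```

This only shows the shape of the data (container types, lengths, attribute names); interpreting what each entry means still requires reading the NEAT source.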
Got it, thanks. I was hoping there'd be some plain text files with the mutation distributions that I could look at, but if the data structures are more complex that's understandable.
Thanks!
There is a plot mut model function that can do some visualizations, but it's untested and I can't even say for sure if it works. That's a point of development at the moment.
-Josh