cliu32 / athlates


HLA database update #1

Open shiwanyin opened 2 years ago

shiwanyin commented 2 years ago

Hi cliu32, after I updated the HLA database to 3.46, I got the error "err: process_block: block size != refs size" when using the athlates typing function. I tried several times and still get this error. Could you give me some advice?

cliu32 commented 2 years ago

You may want to inspect the msa files, especially toward the end of the alignment, and manually delete the sequences that extend beyond the alignment blocks. good luck!


shiwanyin commented 2 years ago

Thanks for your reply. I am still confused about the 'alignment blocks'. I took the MSA files directly from the IMGT/HLA database without any changes. I used your code to generate the ref and bed files, and the input to your code is simply the combination of the gDNA FASTA and the cDNA FASTA. What confuses me is that I do not know what you mean by 'alignment block'.

cliu32 commented 2 years ago

The IMGT MSA format is not standard. For example, the file for locus A may look like the following. The last line is due to a frameshift and is not part of any alignment. You may delete the last few lines and try again. Other loci may be affected as well and need manual editing. Good luck.

...
A*80:04 --- --- --- --- --- --- --- --- --- -|-- ---
A*80:05 --- --- --- --- --- --- --- --- --- -|-- ---
A*80:06 --- --- --- --- --- --- --- --- --- -|-- ---
A*80:07 --- --- --- --- --- --- --- --- --- -|-- ---
A*80:08N --- --- --- --- --- --- --- --- --- -|-- ---
A*80:09N | ***

cDNA 1098 AA codon 342 |
A*03:437Q GACAG CTGCC TTGTG TGGGA CTGA
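For anyone hitting the same error, here is a minimal Python sketch for locating such trailing blocks. It is not part of athlates and does not reproduce its parser. It assumes that every alignment block opens with a coordinate line starting with `cDNA`, `gDNA`, or `Prot`, and that allele rows start with a name such as `A*80:09N`. The helper names `split_blocks` and `irregular_blocks` are hypothetical.

```python
import re

# Assumed allele-name pattern, e.g. A*80:09N or DRB1*15:01:01:01.
ALLELE_RE = re.compile(r"^[A-Za-z0-9]+\*\d+(?::\d+)*[A-Z]?$")
# Assumed prefixes of the coordinate line that opens each block.
BLOCK_HEADERS = ("cDNA", "gDNA", "Prot")


def split_blocks(text):
    """Group allele names by alignment block, in file order."""
    blocks = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(BLOCK_HEADERS):
            blocks.append([])  # a coordinate line starts a new block
            continue
        tokens = stripped.split()
        if blocks and tokens and ALLELE_RE.match(tokens[0]):
            blocks[-1].append(tokens[0])
    return blocks


def irregular_blocks(text):
    """Return (block_index, n_rows, expected_rows) for blocks that differ from the first."""
    blocks = split_blocks(text)
    if not blocks:
        return []
    expected = len(blocks[0])
    return [(i, len(b), expected) for i, b in enumerate(blocks) if len(b) != expected]
```

Running `irregular_blocks` on the contents of one alignment file returns an empty list when every block has the same number of allele rows. Any tuple it does return points at a block worth inspecting by hand, typically near the end of the file.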


shiwanyin commented 2 years ago

Thanks! I edited my MSA files following your suggestion, and it works with the IMGT/HLA 3.29 release. However, I still get the same error when I use the latest IMGT/HLA release.

cliu32 commented 2 years ago

Which locus? Did you edit all MSA files affected by the non-standard formatting? Sorry to hear about the trouble. In the worst-case scenario, I don't think you need to update the database if this is for research only.
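To narrow down which locus is affected, a rough scan along these lines may help. It is only a sketch and not part of athlates. It assumes the alignment files sit in one directory with a `.txt` extension, and that an allele carrying extra sequence shows up as a name with more rows than most other alleles in the same file. The names `row_counts`, `odd_alleles`, and `scan_directory` are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed allele-name pattern, e.g. A*80:09N or DRB1*15:01:01:01.
ALLELE_RE = re.compile(r"^[A-Za-z0-9]+\*\d+(?::\d+)*[A-Z]?$")


def row_counts(text):
    """Count how many rows each allele name has in one alignment file."""
    counts = Counter()
    for line in text.splitlines():
        tokens = line.split()
        if tokens and ALLELE_RE.match(tokens[0]):
            counts[tokens[0]] += 1
    return counts


def odd_alleles(text):
    """Return {allele: rows} for alleles whose row count differs from the most common one."""
    counts = row_counts(text)
    if not counts:
        return {}
    typical = Counter(counts.values()).most_common(1)[0][0]
    return {name: n for name, n in counts.items() if n != typical}


def scan_directory(msa_dir):
    """Map each file name in msa_dir to its odd alleles, skipping clean files."""
    report = {}
    for path in sorted(Path(msa_dir).glob("*.txt")):
        odd = odd_alleles(path.read_text())
        if odd:
            report[path.name] = odd
    return report
```

Files missing from the report have the same number of rows for every allele. Files that do appear list the alleles to look at first.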


shiwanyin commented 2 years ago

Yes, I removed all the frameshift-affected lines that appeared at the end of the MSA files. I guess there are other non-standard formats in the latest MSA files, but I cannot identify them. Also, I use it not only for research but also for clinical purposes in some situations.

shiwanyin commented 1 year ago

Thanks for your reply. I am still confused about the 'alignment blocks'. I took the MSA files directly from the IMGT/HLA database without any changes. I used your code to generate the ref and bed files, and the input to your code is simply the combination of the gDNA FASTA and the cDNA FASTA. What confuses me is that I do not know what you mean by 'alignment block'.


cliu32 commented 1 year ago

[image: image.png] Attached is an example for HLA-C nuc alignment. Three alleles have extended transcripts due to frameshift, which are outside of the standard alignment blocks above. Try deleting the extra sequence and see what happens. Good luck.
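If editing by hand gets tedious, the deletion could be scripted roughly as follows. This is a sketch under assumptions, not the procedure athlates uses. It treats any line starting with `cDNA`, `gDNA`, or `Prot` as the start of a block and drops every block whose number of allele rows differs from the first block. `drop_irregular_blocks` is a hypothetical name. Any footer text after a dropped final block is dropped with it, so it is safer to write the result to a copy and compare it against the original.

```python
import re

# Assumed allele-name pattern, e.g. A*80:09N or DRB1*15:01:01:01.
ALLELE_RE = re.compile(r"^[A-Za-z0-9]+\*\d+(?::\d+)*[A-Z]?$")
# Assumed prefixes of the coordinate line that opens each block.
BLOCK_HEADERS = ("cDNA", "gDNA", "Prot")


def count_allele_rows(block_lines):
    """Count lines in a block whose first token looks like an allele name."""
    total = 0
    for row in block_lines:
        tokens = row.split()
        if tokens and ALLELE_RE.match(tokens[0]):
            total += 1
    return total


def drop_irregular_blocks(text):
    """Return text without blocks whose allele-row count differs from the first block."""
    preamble, blocks = [], []
    for line in text.splitlines():
        if line.strip().startswith(BLOCK_HEADERS):
            blocks.append([line])    # start a new block at each coordinate line
        elif blocks:
            blocks[-1].append(line)  # everything else belongs to the current block
        else:
            preamble.append(line)    # file header before the first block
    if not blocks:
        return text
    expected = count_allele_rows(blocks[0])
    kept = [b for b in blocks if count_allele_rows(b) == expected]
    return "\n".join(preamble + [row for b in kept for row in b]) + "\n"
```

The function works on the file contents as a string, so the original file is never modified unless the caller writes the result back.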
