TeselaGen / openVectorEditor

DEPRECATED - Teselagen's Open Source Vector/Plasmid Editor Component
https://teselagen.github.io/tg-oss/ove/#/Editor
MIT License

alternative .csv file types for feature annotation #862

Open GretaHultqvist opened 2 years ago

GretaHultqvist commented 2 years ago

@tnrich I am switching to OVE after previously using Benchling, where I had several hundred proteins annotated. In Benchling you can save features both as proteins and as DNA. Is this possible in OVE? I have my features saved as proteins in 95% of cases, since to me the DNA sequence is not so important but the resulting protein is. If not, is this something that you could implement? In Benchling you can export all the features that you have annotated, and it would be great if this file could be used to auto-annotate the proteins in OVE, but the files seem incompatible.

The CSV file I export from Benchling looks like the one below. For me it would be extremely valuable if this file could be read by OVE.

Name,Feature,Type,Color,Match type
15 long linker based on the one I designed,APGSGTGGGSGSAPG,,#85DAE9,protein
20 aa linker based on the one I designed,APGGSGGGTGGGSGGGSAPG,,#C7B0E3,protein
22 aa linker,GPGSGGGGSGGGGSGGGGSGPG,,#f58a5e,protein
38 aa lkong linker,APGGSGGGSGGGSGGGSGGGSGGGTGGGSGGGSAGSPG,,#75C6A9,protein
3d6 variable domain LC,YVVMTQTPLTLSVTIGQPASISCKSSQSLLDSDGKTYLNWLLQRPGQSPKRLIYLVSKLDSGVPDRFTGSGSGTDFTLKISRIEAEDLGLYYCWQGTHFPRTFGGGTKLEIK,,#ff9ccd,protein
3d6 variable HC,EVKLVESGGGLVKPGASLKLSCAASGFTFSNYGMSWVRQNSDKRLEWVASIRSGGGRTYYSDNVKGRFTISRENAKNTLYLQMSSLKSEDTALYYCVRYDHYSGSSDYWGQGTTVTVS,,#f58a5e,protein

tnrich commented 2 years ago

Hi @GretaHultqvist thanks for making a ticket here!

I just want to make sure I understand what you're asking. Are you talking about wanting to be able to complete the following dialog:

image

Using the csv file that is directly exported from benchling and contains mostly protein annotations?

Thanks! Thomas

GretaHultqvist commented 2 years ago

Yes, indeed. That is exactly what I mean. It would be amazing if this could be fixed. It would also be great to have a function for adding more features to a .csv file, so that everything new you create can also be saved in your .csv file. Greta
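
The request to add new features to an existing .csv file can be illustrated outside the editor. The Python below is only a sketch, not an OVE function: `append_feature` is a hypothetical helper, and it assumes the five-column layout of the Benchling export shown in the first comment (Name, Feature, Type, Color, Match type).

```python
import csv


def append_feature(path, name, sequence, color="", match_type="protein", feature_type=""):
    """Append one feature row to an existing CSV (hypothetical helper, not part of OVE).

    Column order is assumed to be: Name, Feature, Type, Color, Match type.
    """
    with open(path, "r+", newline="", encoding="utf-8") as handle:
        content = handle.read()  # leaves the file position at the end
        # If the last row has no trailing newline, add one so rows do not merge.
        if content and not content.endswith(("\n", "\r")):
            handle.write("\n")
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow([name, sequence, feature_type, color, match_type])
```

Calling `append_feature("features.csv", "new linker", "GGGGS", color="#85DAE9")` would add one protein row to the end of the file; the file name and values here are made up for the example.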

tnrich commented 2 years ago

Hi @GretaHultqvist thanks for your response,

I'll see what I can do, this isn't the highest priority for me right now but it also shouldn't be too tough to allow protein annotations to be included in the uploaded CSV for the auto-annotation tool. Fully matching the benchling CSV would be a significant amount of additional work.

Another question I have.. when these annotations get auto-annotated onto the sequence you have open, what should those annotations look like:

  1. Should those annotations be features of type CDS?
  2. Should those features be "saved as proteins" in some other way? I'm unfamiliar with how the protein annotations work in Benchling so some guidance would be needed here.

GretaHultqvist commented 2 years ago

So we design protein therapeutics, which are often antibodies. We often copy and paste different parts from one antibody to another, and when we are done copying and pasting the different parts we optimize the DNA sequence for expression in human cells. We then want to save this new DNA sequence as the final sequence that we order. If the annotations are made at the nucleotide level, we cannot auto-annotate, since the nucleotide sequence changes in the optimisation but the protein sequence does not. I have attached an image of how the gene (reading frame) we are expressing was annotated. So the protein features are just parts of the coding sequence, e.g. a linker that we often use, the constant part of one type of antibody, or the variable part. I do not think they need to be saved as proteins; they are more the building blocks that we use. I hope this answers the question.
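
The point about codon optimisation can be made concrete with a small sketch. The Python below is illustrative only and is not OVE's implementation: `translate` and `find_protein_feature` are hypothetical helper names, the search covers the three forward reading frames only, and the standard genetic code is assumed.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate DNA from its first base; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def find_protein_feature(dna, protein):
    """Return sorted (start, end) nucleotide spans, 0-based and end-exclusive,
    where `protein` occurs in any of the three forward reading frames."""
    hits = []
    for frame in range(3):
        amino_acids = translate(dna[frame:])
        pos = amino_acids.find(protein)
        while pos != -1:
            start = frame + 3 * pos
            hits.append((start, start + 3 * len(protein)))
            pos = amino_acids.find(protein, pos + 1)
    return sorted(hits)
```

Two DNA sequences that differ at the nucleotide level but encode the same peptide give the same span for a protein feature, which is why a protein-level match survives optimisation while a DNA-level match would not.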

E-mailing Uppsala University means that we will process your personal data. For more information on how this is performed, please read here: http://www.uu.se/en/about-uu/data-protection-policy

tnrich commented 2 years ago

@GretaHultqvist could you re-attach the image to your message above? I don't think it made it here.

I think you'll need to do so thru github.com and not via email. Thanks again for your explanation!

GretaHultqvist commented 2 years ago

I think it was this one. Greta

tnrich commented 2 years ago

@GretaHultqvist please try uploading it from https://github.com/TeselaGen/openVectorEditor/issues/862, not via email response.

tnrich commented 2 years ago

This is what I'm seeing on my end:

image

GretaHultqvist commented 2 years ago

Screenshot 2022-10-12 at 08 59 42

GretaHultqvist commented 2 years ago

Ah I see. I think I did it now. Greta

tnrich commented 2 years ago

Hey @GretaHultqvist,

Please note I have NOT yet added the underlying functionality necessary to support auto-annotating PROTEIN sequences. However, I have been working on adding an upload CSV wizard to our underlying component system. Here are the issues tracking that:

https://github.com/TeselaGen/teselagen-react-components/issues/293
https://github.com/TeselaGen/teselagen-react-components/issues/289

Here's what the v1 experience will look like for you as a user of OVE adding your custom CSV file:

image

After dragging in your CSV file you'll see this:

image

Selecting Feature for the sequence field:

image

After hitting the Review and Edit Data button:

image

After hitting Add File:

image

Let me know if this workflow makes sense to you. Once I have time to add the underlying protein auto-annotate functionality I'll let you know and enable it in conjunction with this feature.

Cheers! Thomas

tnrich commented 1 year ago

@GretaHultqvist are you using the desktop application of OVE?

GretaHultqvist commented 1 year ago

Yes, I am. Greta

tnrich commented 1 year ago

@GretaHultqvist would you mind trying out the new release here and see if my changes solve this issue for you? https://github.com/TeselaGen/ove-electron/releases/tag/v1.5.3

Cheers!

GretaHultqvist commented 1 year ago

I will. I have a new M2 mac, will the M1 mac version work for that? (old computer was stolen). Greta

GretaHultqvist commented 1 year ago

Screenshot 2022-12-01 at 14 23 18

So I get this error message when trying to install. Could it be due to the M2 processor?

tnrich commented 1 year ago

Hmm, you shouldn't be seeing that. What happens if you right click the downloaded file and ask to open it? Maybe something is wrong with the latest release, however. I'll look into it today. Thanks!

tnrich commented 1 year ago

@GretaHultqvist hmm looks like something changed between releases.. I'll need to look into it further but I am seeing the same thing on my m1 mac. I'll un-publish that release now since it appears to be corrupted.

GretaHultqvist commented 1 year ago

Ok great. Let me know when I should try again. Greta

tnrich commented 1 year ago

Hey @GretaHultqvist could you try out https://github.com/TeselaGen/ove-electron/releases/tag/v1.5.4 and see if it works for you? You'll need to right click the app the first time you try to run it.

GretaHultqvist commented 1 year ago

The program works. Not sure about the protein feature annotations….

GretaHultqvist commented 1 year ago

So I tried to annotate a bit of the sequence, but did not see that I could select protein there. Also looked at what type of file I needed if I wanted to auto annotate, and no protein selection there either. So not completely solved. Where exactly have you made the changes? Should it perhaps automatically detect if it is protein or DNA? Greta

tnrich commented 1 year ago

@GretaHultqvist whoops looks like the auto annotate enhancement that adds protein matching to OVE didn't make it into the release I just published. Not sure why that didn't happen but I must not have updated the dep. Let me update everything and republish. At that point you should see a new column here: image

I'll let you know when the new release comes out

tnrich commented 1 year ago

Ok, this should be the updated version here @GretaHultqvist https://github.com/TeselaGen/ove-electron/releases/tag/v1.5.5

GretaHultqvist commented 1 year ago

I have tested version 1.5.5 now and am still not sure how to get the protein feature or annotation.

Screenshot 2022-12-02 at 10 22 18

When I, for instance, choose New Feature in the screenshot above I get these options:

Screenshot 2022-12-02 at 10 24 14

These do not include an option to save as protein. Is it somewhere else I should be looking? Greta

tnrich commented 1 year ago

@GretaHultqvist it seems like there is some other feature you're expecting to see that you haven't yet described.. All I added was a way to auto-annotate new features using a CSV with a match type of "protein". I also made it so if you used a file with slightly different formatting, the import system would try its best to match up the headers and allow you to edit the data if needed.
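
As a rough illustration of what matching up headers from a slightly different file could involve, here is a minimal Python sketch. It is not the actual import code: `CANONICAL_HEADERS`, `normalize`, and `map_headers` are hypothetical names, and the alias lists are assumptions. Only the mapping of "Feature" to the sequence field and the `matchType` column name come from this thread.

```python
import re

# Hypothetical canonical field names with the normalized aliases accepted for each.
CANONICAL_HEADERS = {
    "name": {"name"},
    "sequence": {"sequence", "feature"},
    "type": {"type"},
    "color": {"color", "colour"},
    "matchType": {"matchtype"},
}


def normalize(header):
    """Lower-case a header and drop spaces and punctuation."""
    return re.sub(r"[^a-z0-9]", "", header.lower())


def map_headers(headers):
    """Map each incoming header to a canonical field, or None if unrecognized."""
    mapping = {}
    for header in headers:
        key = normalize(header)
        mapping[header] = next(
            (field for field, aliases in CANONICAL_HEADERS.items() if key in aliases),
            None,
        )
    return mapping
```

Headers that map to `None` are the ones a user would have to assign by hand in a review step.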

Here is the standard auto-annotate flow:

  1. File > Auto Annotate > Auto Annotate Features

  2. image
  3. Upload a file with the matchType column set to 'protein':

image

  4. That should allow features to be auto-annotated on the sequence via an amino acid string

Note you can also use the CSV file you pasted in your first comment, the one you were asking to be able to be imported into OVE:

The CSV file I export from benchling looks like below. For me it would be extremely valuable if this file could be read by OVE.

Name,Feature,Type,Color,Match type
15 long linker based on the one I designed,APGSGTGGGSGSAPG,,#85DAE9,protein
20 aa linker based on the one I designed,APGGSGGGTGGGSGGGSAPG,,#C7B0E3,protein
22 aa linker,GPGSGGGGSGGGGSGGGGSGPG,,#f58a5e,protein
38 aa lkong linker,APGGSGGGSGGGSGGGSGGGSGGGTGGGSGGGSAGSPG,,#75C6A9,protein
3d6 variable domain LC,YVVMTQTPLTLSVTIGQPASISCKSSQSLLDSDGKTYLNWLLQRPGQSPKRLIYLVSKLDSGVPDRFTGSGSGTDFTLKISRIEAEDLGLYYCWQGTHFPRTFGGGTKLEIK,,#ff9ccd,protein
3d6 variable HC,EVKLVESGGGLVKPGASLKLSCAASGFTFSNYGMSWVRQNSDKRLEWVASIRSGGGRTYYSDNVKGRFTISRENAKNTLYLQMSSLKSEDTALYYCVRYDHYSGSSDYWGQGTTVTVS,,#f58a5e,protein
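
For readers who want to inspect such a file before uploading it, the following Python sketch reads rows in this layout and keeps the ones whose match type is protein. It is a standalone illustration using the standard `csv` module, not part of OVE; `read_protein_features` is a hypothetical helper, and `SAMPLE_CSV` is a two-row excerpt of the file above.

```python
import csv
import io

# Two rows copied from the CSV above, for demonstration.
SAMPLE_CSV = """Name,Feature,Type,Color,Match type
15 long linker based on the one I designed,APGSGTGGGSGSAPG,,#85DAE9,protein
22 aa linker,GPGSGGGGSGGGGSGGGGSGPG,,#f58a5e,protein
"""


def read_protein_features(text):
    """Return name, sequence, and color for every row whose match type is 'protein'."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "name": row["Name"],
            "sequence": row["Feature"].strip().upper(),
            "color": row["Color"],
        }
        for row in reader
        if row["Match type"].strip().lower() == "protein"
    ]
```

Each returned dictionary carries the amino acid string in `sequence`, which is the value the auto-annotation step would search for.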

Please let me know if you still have any questions with the new protein auto annotation flow