ShouWenWang-Lab / Custom_CARLIN


Some Questions about the CARLIN sequences #1

Open jefferyUstc opened 6 months ago

jefferyUstc commented 6 months ago

Hi, DARLIN Team,

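Take the cCARLIN sequence (https://github.com/ShouWenWang-Lab/Custom_CARLIN/blob/ffe5c8a3abad88246031cae203fe54704b4fddba/%40CARLIN_def/CARLIN_def_cCARLIN.m#L3) as an example. (1) What is the meaning of "SecondarySequence"?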

Based on the above sequences, the exact structure of Read2 seems to be:

prefix + Primer5 + [segment+PAM]*10+SecondarySequence+Primer3+postfix

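So, in the preprocessing code of MosaicLineage (https://github.com/ShouWenWang-Lab/MosaicLineage/blob/f59a4b9067bd342414a405145490d6d35a2d628b/mosaiclineage/DARLIN.py#L262), there seems to be a trick for extracting the part of the read that belongs to: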

[segment+PAM]*10

However, the Primer3 sequences in the two repositories do not seem to be the same.

In MosaicLineage:

CC_5prime = "AGCTGTACAAGTAAGCGGC"
CC_3prime = "AGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCT"

In Custom_CARLIN:

Primer5 = 'GAGCTGTACAAGTAAGCGGC'; % remember to update obj.match_score.Primer5;
Primer3 = 'CGACTGTGCCTTCTAGTTGC';
SecondarySequence = 'AGAATTCTAACTAGAGCTCGCTGATCAGCCT';

(2) Why? Is this related to the source of SecondarySequence? That is, how were Primer5, Primer3, and SecondarySequence generated? Are they conserved sequences in the array, or were they introduced during amplification?
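A direct comparison of the strings quoted above suggests that the two sets of constants describe the same flanking regions, only split at different positions: CC_5prime is Primer5 without its first base, and CC_3prime is SecondarySequence followed by the first 14 bases of Primer3. The short Python check below uses only the values quoted in this issue; it says nothing about why each repository chose its boundaries.

```python
# Constants copied verbatim from the two repositories as quoted in this issue.
CC_5prime = "AGCTGTACAAGTAAGCGGC"
CC_3prime = "AGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCT"
Primer5 = "GAGCTGTACAAGTAAGCGGC"
Primer3 = "CGACTGTGCCTTCTAGTTGC"
SecondarySequence = "AGAATTCTAACTAGAGCTCGCTGATCAGCCT"

# Is the MosaicLineage 5' flank a suffix of the Custom_CARLIN Primer5?
five_prime_matches = Primer5.endswith(CC_5prime)

# Is the MosaicLineage 3' flank a prefix of SecondarySequence + Primer3?
three_prime_matches = (SecondarySequence + Primer3).startswith(CC_3prime)

# How many bases of Primer3 are included in CC_3prime?
primer3_overlap = len(CC_3prime) - len(SecondarySequence)

print(five_prime_matches, three_prime_matches, primer3_overlap)  # True True 14
```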

ShouWenWang commented 6 months ago

These are conserved sequences in the array. They are used to extract the actual lineage barcodes, and they also serve as a quality-control check on the reads.

The two repositories differ because they perform sequence extraction and quality control in different ways.

―― Shou-Wen Wang, PhD Principal Investigator School of Life Sciences | School of Sciences Westlake University Shilongshan ST #18, Xihu, Hangzhou, Zhejiang https://www.shouwenwang-lab.com/
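To illustrate the idea of using conserved flanking sequences both to locate the barcode region and to reject reads that lack them, here is a minimal Python sketch. It is not the code from either repository: the function name `extract_between_flanks`, the exact-match search, and the toy read are all assumptions made for illustration, and a real pipeline may tolerate mismatches or trim differently.

```python
from typing import Optional

# Flank constants as quoted from MosaicLineage in this issue.
CC_5prime = "AGCTGTACAAGTAAGCGGC"
CC_3prime = "AGAATTCTAACTAGAGCTCGCTGATCAGCCTCGACTGTGCCTTCT"


def extract_between_flanks(read: str, five_prime: str, three_prime: str) -> Optional[str]:
    """Hypothetical helper: return the sequence between the two flanks.

    Returns None when either flank is absent, which acts as a simple
    quality-control filter (exact matching only).
    """
    start = read.find(five_prime)
    if start == -1:
        return None  # 5' conserved sequence not found
    start += len(five_prime)
    end = read.find(three_prime, start)
    if end == -1:
        return None  # 3' conserved sequence not found
    return read[start:end]


# Synthetic read for illustration only; "ACGTACGT" stands in for the barcode array.
toy_read = "TTTT" + CC_5prime + "ACGTACGT" + CC_3prime + "GGGG"
print(extract_between_flanks(toy_read, CC_5prime, CC_3prime))            # ACGTACGT
print(extract_between_flanks("TTTTACGTACGTGGGG", CC_5prime, CC_3prime))  # None
```

With exact matching, a single sequencing error in a flank discards the read, so the choice of flank length and position is a trade-off between strictness and yield.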

