Closed charles-cowart closed 4 months ago
@charles-cowart I use it to confirm whether or not the i2 barcodes are in the forward or reverse orientation when checking sample sheets. This is important because the NovaSeq 6000 should be in the reverse (BarcodesAreRC true) and the NovaSeq X should be in the forward (BarcodesAreRC false). If the samplesheet doesn't reflect the correct orientation, I need to change the barcodes to the correct orientation. @RodolfoSalido - Orientation is still accurately reflected in that column, correct?
Unfortunately, as far as I know, the value for that field is currently populated from user input and is error-prone. I had mentioned this before, but it didn't get much attention since that field is not used in any automated process from the SPP (correct me if I'm wrong).
I think the best solution going forward would be to populate that field automatically as part of the make_sample_sheet() function, which already checks the sequencer against the REVCOMP_SEQUENCERS list to determine BarcodesAreRC.
Sounds like the BarcodesAreRC boolean is mostly for humans to check whether the samplesheet has a reverse-complemented i5 when troubleshooting.
-Rodolfo
@RodolfoSalido Agreed, that should be automatic. How is someone currently able to check before processing that i2 is reverse or forward?
I don't think wetlab techs check. The function handles the i5 orientation automatically, but the BarcodesAreRC field doesn't get updated accordingly. The BarcodesAreRC boolean gets populated from a user-facing Jupyter form with a Bioinformatics dictionary.
-Rodolfo
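The manual check asked about above, whether the i5 (index 2) barcode in a sheet is in the forward or reverse-complement orientation, could be sketched roughly as follows. This assumes the forward sequence of each barcode is known; the function names `reverse_complement` and `i5_orientation` are hypothetical and not part of Metapool.

```python
# Complement table for unambiguous DNA bases plus N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA barcode."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def i5_orientation(sheet_i5, forward_i5):
    """Classify a sheet's i5 barcode relative to its known forward sequence.

    Hypothetical helper for illustration only.
    """
    sheet_i5, forward_i5 = sheet_i5.upper(), forward_i5.upper()
    if sheet_i5 == forward_i5:
        # Palindromic barcodes (equal to their own reverse complement)
        # are also reported as forward.
        return "forward"
    if sheet_i5 == reverse_complement(forward_i5):
        return "reverse_complement"
    return "mismatch"
```

A sheet whose i5 values come back as "reverse_complement" would be expected to carry BarcodesAreRC true, and "forward" would go with BarcodesAreRC false.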
For what it's worth, this is the only place the column gets accessed in Metapool: https://github.com/biocore/metagenomics_pooling_notebook/blob/06012646b9f1b24338700deaa20f3619b08bc906/metapool/sample_sheet.py#L438-L439
REVCOMP_SEQUENCERS is defined here: https://github.com/biocore/metagenomics_pooling_notebook/blob/06012646b9f1b24338700deaa20f3619b08bc906/metapool/metapool.py#L17
Based on what you guys just said, and the code I highlighted, it seems like make_sample_sheet() is already accurately setting that value. Lines 438-439 are part of the _add_data_to_sheet() method, which is used by make_sample_sheet(). The value is set to True if the sequencer passed to make_sample_sheet() is in the list of REVCOMP_SEQUENCERS, and False otherwise.
It seems that the value will be accurate, at least for a sheet made using make_sample_sheet(). If the user alters it afterward, that could be an issue, but it sounds like most of the time the values would be accurate after all?
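As a rough illustration of the logic described above (not the actual Metapool code), the flag can be derived from a simple membership test. The sequencer names in this list and the helper name `barcodes_are_rc` are hypothetical placeholders; the real list is the one defined in metapool/metapool.py.

```python
# Hypothetical stand-in for the REVCOMP_SEQUENCERS list in metapool/metapool.py;
# the real entries may differ.
REVCOMP_SEQUENCERS = ["NovaSeq6000"]


def barcodes_are_rc(sequencer):
    """Return True when the sequencer expects reverse-complemented i5 barcodes."""
    return sequencer in REVCOMP_SEQUENCERS


if __name__ == "__main__":
    for name in ("NovaSeq6000", "NovaSeqX"):
        print(name, barcodes_are_rc(name))
```

Because the value depends only on the sequencer name, it stays consistent with the i5 orientation as long as nobody edits the column by hand afterward.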
Cool, I think you are right Charlie!
I was under the impression the value was user provided because it is a field in the sample_sheet form, but it is nice to see it is actually automatically populated.
-Rodolfo
Thanks Rodolfo and MacKenzie! It sounds like we can close this issue then.
I believe BarcodesAreRC in the Bioinformatics section of sample-sheets is no longer used. It's before my time, but there's not much mention of it in the codebase. It's not mentioned in the bcl-convert handbook, so I don't believe it's required for converting files to fastq. I'm raising it as an issue before @RodolfoSalido leaves, just in case he is the most knowledgeable person about it.
@mmbryant23 do you know if we still need this column?