MathildeFogPerez / manuscript-rep-phad

Scripts to process 10X BCR data and create an AIRR input file (to upload into RepSeq platform)
Apache License 2.0

No matching between fastq and metadata in the script. #1

Open dakomura opened 1 year ago

dakomura commented 1 year ago

Hi there,

Thank you for sharing this valuable dataset.

I downloaded all the fastq files from the following links and ran Cell Ranger VDJ on them:

https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-11697
https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-11174

Then I tried to use your script to convert the data into the AIRR file format, but I could not match the sample IDs for some samples. For example, D2_files/metadata_D2.txt contains s101a (sample) and BT1p1a (newSampleId), but I found these IDs in neither https://www.ebi.ac.uk/biostudies/files/E-MTAB-11174/E-MTAB-11174.sdrf.txt nor https://www.ebi.ac.uk/biostudies/files/E-MTAB-11697/E-MTAB-11697.sdrf.txt.

How can I match IDs for such samples?

MathildeFogPerez commented 1 year ago

Hi,

I was not in charge of uploading the data to the public database, because I had changed jobs by then. It is true that the sample IDs on ArrayExpress do not correspond to the list I gave here. Nevertheless, you can easily match them. The one you are looking for is 'Donor2_PC_may2020' in ArrayExpress. The metadata.txt and samples.txt files can be changed according to the sample names you assign after running Cell Ranger. It should not be a problem.

best, Mathilde
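
For anyone hitting the same mismatch, here is a minimal sketch of the renaming suggested above. It is not part of the repository's scripts. It assumes the metadata file is tab-separated with a header row that includes a `sample` column (inferred from the column names quoted in the question; the real layout of metadata_D2.txt may differ). The function name `rename_samples` and the `ID_MAP` dictionary are hypothetical, and the only pairing taken from this thread is s101a with 'Donor2_PC_may2020'.

```python
import csv
import io

# Hypothetical mapping from the IDs used in this repository to the sample
# names used on ArrayExpress / in your Cell Ranger runs.
# Only this single pair comes from the thread; extend it yourself.
ID_MAP = {"s101a": "Donor2_PC_may2020"}


def rename_samples(text, id_map, column="sample", delimiter="\t"):
    """Return the table text with the values in `column` replaced via id_map.

    Assumes a delimited table with a header row (an assumption about the
    metadata file layout). IDs missing from id_map are left unchanged.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    fieldnames = reader.fieldnames or []
    if column not in fieldnames:
        raise KeyError(f"column {column!r} not found in header {fieldnames}")

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Swap the ID only when a replacement is known.
        row[column] = id_map.get(row[column], row[column])
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Tiny in-memory example using the IDs quoted in the question.
    example = "sample\tnewSampleId\ns101a\tBT1p1a\n"
    print(rename_samples(example, ID_MAP), end="")
```

To apply it to a real file, read the file contents into a string, pass them to `rename_samples`, and write the result to a new file so the original stays untouched. Whether the `newSampleId` column also needs updating depends on how the conversion script uses it, which this sketch does not cover.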