hakyim / TO-DELETE-PrediXcan

Code for the in-dev PrediXcan Project
MIT License

Error with dosage/samples.txt files #35

Closed simonlee184 closed 4 years ago

simonlee184 commented 4 years ago

My dosage files contain lines with no rsID, and PrediXcan can't handle these rows. I removed those lines, ran PrediXcan again, and received the error below:

ERROR: There are not enough rows in your sample file! Make sure dosage files and sample files have the same number of individuals in the same order.

Would you happen to have a solution for this?

Heroico commented 4 years ago

Hi there!

There seem to be two unrelated issues.

1) Regarding the "lines with no rsids": PrediXcan is very strict about input formats. The dosage file format, chromosome rsid position allele1 allele2 MAF id1 ..... idn, requires the first six fixed fields, followed by an entry for every individual in your sample. All fields must be present (although rsid doesn't strictly need to be an rsid: it can be any string, such as chr1:123:C:T).

Removing lines without rsids is an option, but I suggest identifying the variants with no rsid and assigning each a proper id.

2) The ERROR: ... message might be confusing. It means that the samples file lists fewer individuals than the genotype dosage file contains.
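
For the first point, here is a minimal sketch of one way to give such variants an id instead of dropping them. It is not part of PrediXcan. It assumes the rows are whitespace-separated and that a missing rsid shows up as a placeholder such as "." or "NA" in the second field; if the field is absent altogether, the row has too few columns and the sketch raises an error instead of guessing. The function name fill_missing_ids and the chr:pos:a1:a2 id layout are illustrative choices.

```python
def fill_missing_ids(lines, placeholders=(".", "NA")):
    """Yield dosage rows, replacing a placeholder id with chrom:pos:a1:a2.

    Assumes whitespace-separated rows laid out as:
    chromosome rsid position allele1 allele2 MAF id1 ... idn
    The placeholder values are an assumption, not something PrediXcan defines.
    """
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) < 6:
            # A row missing one of the six fixed fields cannot be repaired here.
            raise ValueError("row %d has only %d fields" % (number, len(fields)))
        chrom, var_id, pos, a1, a2 = fields[:5]
        if var_id in placeholders:
            fields[1] = "%s:%s:%s:%s" % (chrom, pos, a1, a2)
        yield " ".join(fields)
```

If the dosage files are gzipped, the rows can be read with gzip.open(path, "rt") and the result written back out line by line.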

simonlee184 commented 4 years ago

So I'm a bit confused as to how it knows if the samples file has fewer samples than what's found in the genotype dosage file. They have a different number of rows and there's no information about the sample name in the dosage files even in the examples in the PredixcanExample directory.

Heroico commented 4 years ago

The number of samples in the genotype dosage file is the number of fields after chromosome rsid position allele1 allele2 MAF. Any column after MAF is a sample. In the sample file, each sample gets listed in a row. The sample file must have an entry for every sample column in the genotype file.
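
The counting rule above can be checked before running PrediXcan. This is a rough sketch, not PrediXcan's own code. It assumes whitespace-separated dosage rows, a samples file with one sample per non-empty row and no header line, and the helper names are made up for illustration.

```python
N_FIXED = 6  # chromosome rsid position allele1 allele2 MAF


def count_dosage_samples(dosage_lines):
    """Return the number of sample columns, checking every row agrees."""
    counts = set()
    for line in dosage_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts.add(len(fields) - N_FIXED)
    if len(counts) != 1:
        raise ValueError("inconsistent sample counts: %s" % sorted(counts))
    return counts.pop()


def count_sample_rows(sample_lines):
    """Return the number of non-empty rows (assumes no header line)."""
    return sum(1 for line in sample_lines if line.strip())


def samples_match(dosage_lines, sample_lines):
    """True when the samples file has one row per dosage sample column."""
    return count_dosage_samples(dosage_lines) == count_sample_rows(sample_lines)
```

For files on disk, pass open file handles (gzip.open(path, "rt") for gzipped dosage files). A False result, or a ValueError from rows of differing length, points to the mismatch behind the error message.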

Heroico commented 4 years ago

Closing due to inactivity.