ramachandran-lab / pong

Fast analysis and visualization of latent clusters in population genetic data

Problem with ind2pop file. #14

Closed (Hjorvik closed 1 year ago)

Hjorvik commented 4 years ago

Hi!

I'm trying to run pong, but when specifying the ind2pop file I keep getting this error message: "error: individual population file assignment contains more than 1 column of data"

However, my file consists of only one column, listing the population of each individual, with no spaces.

Am I doing anything wrong?

shahamat commented 4 years ago

I think this error might be incorrectly triggered if the number of lines in your ind2pop file is not the same as the number of individuals in your Q matrix. As a first step, can you please check how many individuals are in the ind2pop file and how many are in your Q matrix file?
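The check suggested above can be sketched in a few lines of Python. This is only an illustrative helper, not part of pong: the function names are hypothetical, and it assumes one individual per line in both files and that blank lines should not be counted.

```python
from pathlib import Path


def count_nonblank_lines(path):
    """Count lines that contain at least one non-whitespace character."""
    with Path(path).open() as handle:
        return sum(1 for line in handle if line.strip())


def compare_counts(ind2pop_path, q_path):
    """Return (labels, individuals, match) for an ind2pop file and a Q matrix.

    Assumes one population label per line in the ind2pop file and one
    individual per line in the Q matrix file.
    """
    n_labels = count_nonblank_lines(ind2pop_path)
    n_individuals = count_nonblank_lines(q_path)
    return n_labels, n_individuals, n_labels == n_individuals
```

Calling compare_counts with the paths to your own ind2pop file and one Q matrix returns both counts, so a mismatch is visible at a glance.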

guhanrv commented 4 years ago

@shahamat - I also have this same issue.

I am running

pong -m ukbb_cob_sub_filemap.tsv.txt -i ukbb_cob_sub_ind2pop.tsv.txt -n ukbb_cob_sub_pop_names.tsv.txt

I checked to see that these files have the same line count. I am attaching the files here to see if you can reproduce the issue. Much appreciated!

ukbb_cob_sub_pop_names.tsv.txt ukbb_cob_sub.8.Q.txt ukbb_cob_sub_filemap.tsv.txt ukbb_cob_sub_ind2pop.tsv.txt

shahamat commented 4 years ago

I think the issue might be with the Q matrix file (ukbb_cob_sub.8.Q.txt). When I open the file, I see 16542 empty lines at the bottom of the file. Therefore pong thinks that there are 267098 individuals (instead of 283640 individuals) and expects the ind2pop file to also have 267098 labels. Perhaps something went wrong with the Q matrix generation process?
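Blank lines like these are easy to miss because a plain line count still matches. A rough Python sketch for spotting them follows; blank_line_report is a hypothetical helper, not a pong function, and it assumes the Q matrix is a plain text file with one individual per line.

```python
def blank_line_report(path):
    """Summarize total, blank, and trailing blank lines in a text file."""
    with open(path) as handle:
        lines = handle.read().splitlines()

    # A line counts as blank if it is empty or contains only whitespace.
    blank = sum(1 for line in lines if not line.strip())

    # Count how many blank lines sit at the very end of the file.
    trailing = 0
    for line in reversed(lines):
        if line.strip():
            break
        trailing += 1

    return {
        "total": len(lines),
        "blank": blank,
        "trailing_blank": trailing,
        "data_rows": len(lines) - blank,
    }
```

If "data_rows" is smaller than "total", the raw line count overstates the number of individuals in the file.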

guhanrv commented 4 years ago

@shahamat - thanks for pointing that out! resolved.

shahamat commented 4 years ago

Awesome!

quinn-ca commented 1 year ago

Hello, I'm also getting this error. I generated my .Q matrices using admixture_wrapper.py. I confirmed my ind2pop file has a single column and that the .Q matrices have the same number of lines as the ind2pop file. I've run pong successfully before but I'm not sure what else I can troubleshoot. I'd appreciate your help!

shahamat commented 1 year ago

Would you be able to share a Q matrix and your ind2pop file (or parts of it), along with the command you are using to run pong when you hit this error?

quinn-ca commented 1 year ago

Yes, thank you! The code I ran is below. I've run it with all items called by the code in the same directory and with full paths to all files, but get the same error.

pong -m pong_filemap_autosomal_genstr_rmRelated_wrapper -i popID_list_relatedremoved -n names_fourpops_order

renamed_chrs_rmATPUsampleID_populations_snps_rmRelated_finalPrunedData.5.9.txt popID_list_relatedremoved.txt

shahamat commented 1 year ago

I was able to get it to run with the two files you provided using the attached filemap. How many Q matrices are in your filemap? And what version of pong are you using? You can check by running pong -h and looking for the line that says -- pong, vX.X

filemap.txt

quinn-ca commented 1 year ago

I'm running Pong v1.5. I have 500 .Q matrices (K=1 to K=5, 100 iterations), which matches the number of lines in my filemap (attached).

pong_filemap_autosomal_genstr_rmRelated_wrapper.txt

shahamat commented 1 year ago

Let's try two things. (1) Can you attach the last Q matrix that is in that filemap? (2) Does pong run successfully if you remove the -i and -n options, as in just run it with the filemap?

quinn-ca commented 1 year ago

I just realized that the filepath was outdated from an earlier run! The Excel sheet I use to build the paths had updated only part of it. I've fixed the filepath and updated the filemap, and now pong is running successfully. Because of the error message, I was focusing too much on the ind2pop file. Thanks so much for your help, and sorry that it was just a mistake on my end.
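For anyone who lands here with the same symptom, stale paths in a filemap can be caught before running pong. The sketch below is a hypothetical helper, not part of pong. It assumes the filemap is tab-delimited with the Q matrix path in the last column, and that relative paths are resolved against the filemap's own directory; adjust it if your layout differs.

```python
from pathlib import Path


def missing_q_matrices(filemap_path):
    """Return (line_number, path) pairs for filemap entries whose file is absent.

    Assumption: tab-delimited filemap, Q matrix path in the last column,
    relative paths taken relative to the filemap's directory.
    """
    filemap = Path(filemap_path)
    missing = []
    with filemap.open() as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # skip blank lines
            fields = line.rstrip("\n").split("\t")
            q_path = Path(fields[-1].strip())
            if not q_path.is_absolute():
                q_path = filemap.parent / q_path
            if not q_path.is_file():
                missing.append((line_number, str(q_path)))
    return missing
```

An empty list means every listed Q matrix was found; otherwise each entry gives the filemap line to fix.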

shahamat commented 1 year ago

No worries, glad it's working now!