nickallsing / TJ_River_Project


Scripts for alpha diversity #1

Closed EorgeKit closed 2 months ago

EorgeKit commented 6 months ago

Hi @nickallsing @nallsing-salk, I am trying to reproduce a similar experiment using soil metagenomics data. Your diversity scripts have been very helpful so far, and thanks for that. I couldn't help noticing that the scripts for alpha diversity are missing. Could you add them to the repository?

Best

nallsing-salk commented 6 months ago

Hello @EorgeKit, Thank you for using the scripts and letting me know about this issue! I have added the alpha diversity scripts to the repository in a new Alpha_Diversity directory in the scripts directory along with input and output. They are pretty specific to the project, but I hope you will be able to use them. Nick

EorgeKit commented 6 months ago

Hi @nickallsing, thanks a lot for clearing that up. I am sure it will be a great help.

I have a question about your Kaiju analysis. Did you classify each sample separately and then combine the TSV files, or did you run the classification on a single file containing a mixture of reads from every sample? In my case the data come from different soil locations, so each sample was sequenced independently and I ran the taxonomic classification per sample. To work with your NMDS code, I had to put the .tsv files from every sample in one directory and prepare a combined OTU table.

The trouble is that, because this is soil metagenomics data, a lot of organisms were not observed even though I used all the Kaiju databases to classify them. The resulting metagenomemodified table therefore has a lot of zeroes: if an organism was found in only one dataset, as was the case most of the time, it is included in the file with its count for that sample but zero for the remaining two samples. This causes problems at the imputation stage of your code, since many columns are far above the z.warning value of 0.8. This step is currently failing even when I use your metagenomemodified file.

Kindly advise if you have any suggestions on the issue. Sorry for the long message.

metadata_mappingfile.txt metagenomemodified.csv
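To illustrate why per-sample classification produces so many zeroes, here is a minimal Python sketch of combining per-sample counts into one table. It is not the repository's R code: the function names, sample names, taxa, and counts are all made up for illustration, and the input is assumed to be already parsed into one taxon-to-count mapping per sample.

```python
def combine_counts(per_sample):
    """Merge {sample: {taxon: count}} into one taxa-by-samples table.

    A taxon that was not observed in a sample gets a count of 0,
    which is where the many zeroes in the combined table come from.
    """
    samples = sorted(per_sample)
    taxa = sorted({t for counts in per_sample.values() for t in counts})
    table = {t: [per_sample[s].get(t, 0) for s in samples] for t in taxa}
    return samples, table


def zero_fraction(table):
    """Fraction of cells in the combined table that are zero."""
    cells = [v for row in table.values() for v in row]
    return sum(1 for v in cells if v == 0) / len(cells)


# Hypothetical counts for three independently sequenced samples.
per_sample = {
    "site_A": {"Bacillus": 12, "Streptomyces": 5},
    "site_B": {"Streptomyces": 7},
    "site_C": {"Nitrospira": 3},
}

samples, table = combine_counts(per_sample)
print(samples)
for taxon, row in table.items():
    print(taxon, row)
print("zero fraction:", zero_fraction(table))
```

When most taxa appear in only one sample, as described above, more than half of the cells in the combined table can end up as zeroes even with only three samples.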

nallsing-salk commented 5 months ago

Hello @EorgeKit,

Apologies for the delayed response. It looks like cmultRepl has added the z.warning and z.delete parameters since I wrote these scripts. I reran the script with my metagenomemodified file, this time adding z.delete = FALSE to the cmultRepl line, and got the same output as before. I have updated the script to include that parameter. This should also help with your samples with a higher number of 0 values. I also tested it on your .csv and .txt files: you will need to remove the Viruses row before running cmultRepl, since all 3 values in that row are 0. Please let me know if you have any additional questions.

Nick
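The cleanup step Nick describes, removing a row whose values are all 0 before imputation, can be sketched in Python as follows. This is only an illustration and not the repository's R code: the function names and the counts are invented, and the 0.8 threshold simply mirrors the z.warning value quoted in the thread. Whether cmultRepl applies its threshold in exactly this way is an assumption.

```python
def drop_all_zero_rows(table):
    """Split a {taxon: [counts per sample]} table into kept rows and dropped names."""
    kept = {t: row for t, row in table.items() if any(v != 0 for v in row)}
    dropped = [t for t in table if t not in kept]
    return kept, dropped


def zero_heavy_rows(table, threshold=0.8):
    """Names of rows whose proportion of zeros exceeds the threshold."""
    return [
        t
        for t, row in table.items()
        if sum(1 for v in row if v == 0) / len(row) > threshold
    ]


# Hypothetical three-sample table; the Viruses row is zero everywhere.
table = {
    "Bacteria": [120, 98, 143],
    "Archaea": [4, 0, 0],
    "Viruses": [0, 0, 0],
}

flagged = zero_heavy_rows(table)
kept, dropped = drop_all_zero_rows(table)
print("flagged:", flagged)
print("dropped:", dropped)
print("kept:", sorted(kept))
```

With three samples, a row that is zero in two of them has a zero proportion of about 0.67, so under this sketch only a row that is zero in every sample exceeds a 0.8 threshold.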