biocore / metagenomics_pooling_notebook

Jupyter notebooks to assist with sample processing
MIT License

Create a generalized 'check_against_Qiita' function #81

Open charles-cowart opened 1 year ago

charles-cowart commented 1 year ago

Metapool has a function, read_plate_map_csv(), that optionally compares the sample names in a plate map against a project's metadata in Qiita. This check is useful, but it would be better if it came earlier in the wet lab's workflow, especially since the people performing the earlier steps may not be the same people who later call read_plate_map_csv() in a notebook.

The wet lab would like a generalized comparison function that takes the same configuration info used to talk to Qiita, a Qiita project ID or project name, and either a file path or a list of sample names. The list of sample names may be generated manually by the wet lab by reading the tube IDs, or taken from a file supplied by a provider. MacKenzie and Rodolfo will provide examples. The function may need to be flexible enough to read those provider file formats, as well as CSV and a basic list format.
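As a starting point for discussion, here is a minimal sketch of the input-handling side: accepting either an in-memory list of sample names or a path to a file. Everything in it is an assumption rather than existing Metapool code: the helper name load_sample_names, the default sample_name column for CSV input, and the rule that any non-.csv file is treated as one name per line. The provider formats are not covered, since no examples are available yet.

```python
import csv
from pathlib import Path


def load_sample_names(source, column="sample_name"):
    """Hypothetical helper: return sample names from a list or a file path.

    `column` is an assumed CSV header name, not a confirmed convention.
    """
    # An in-memory collection of names: strip whitespace, drop blanks.
    if not isinstance(source, (str, Path)):
        return [str(s).strip() for s in source if str(s).strip()]

    path = Path(source)
    with path.open(newline="") as fh:
        if path.suffix.lower() == ".csv":
            # CSV input: read the names from a single named column.
            reader = csv.DictReader(fh)
            if column not in (reader.fieldnames or []):
                raise ValueError(f"no '{column}' column in {path}")
            names = [row[column] for row in reader]
        else:
            # Basic list format: one sample name per line.
            names = fh.read().splitlines()

    # Drop empty entries and surrounding whitespace.
    return [n.strip() for n in names if n and n.strip()]
```

Normalizing both input styles to a plain list keeps the comparison step independent of where the names came from, so provider-specific readers could be added later without touching it.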

The function will be similar to the Linux comm command and return three lists of results:

- The items in Qiita that aren't in the user's list
- The items in the user's list that aren't in Qiita
- The items that are common to both
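The comm-style comparison itself could be sketched with plain set operations, as below. The function name compare_sample_names and its signature are hypothetical, and the sketch assumes the Qiita sample names have already been fetched by whatever client code the project uses; no Qiita calls are shown.

```python
def compare_sample_names(qiita_names, user_names):
    """Hypothetical sketch: three-way comparison of two name collections.

    Returns (only_in_qiita, only_in_user, in_both), each a sorted list.
    """
    qiita, user = set(qiita_names), set(user_names)
    only_in_qiita = sorted(qiita - user)   # in Qiita, missing from the user's list
    only_in_user = sorted(user - qiita)    # in the user's list, missing from Qiita
    in_both = sorted(qiita & user)         # present in both
    return only_in_qiita, only_in_user, in_both


# Example with made-up sample names:
result = compare_sample_names(["s1", "s2", "s3"], ["s2", "s3", "s4"])
print(result)  # (['s1'], ['s4'], ['s2', 's3'])
```

Unlike comm, which expects sorted input and compares line by line, this version uses sets, so input order does not matter and duplicate names are collapsed. If duplicates in the wet lab's list need to be reported, that would have to be a separate check.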