tripal / tripal_analysis_expression

Extension module for the Tripal toolset to show differential expression data. This module was made for Drupal 7, Tripal 3, and Chado 1.3.
GNU General Public License v2.0

Biosample loader and CSV files. #381

spficklin closed this issue 2 years ago

spficklin commented 3 years ago

When I use the Biosample loader to load a CSV file of runs obtained from the NCBI Run Selector, no elements appear in the 'CVTerm Field Configuration' section. I don't get an error telling me anything is wrong either; the loader simply does nothing. Below is a screenshot of what I see.

[Screenshot from 2021-09-09 17-48-07]

I've tracked the problem down to the following lines of code:

https://github.com/tripal/tripal_analysis_expression/blob/35a51746462a81b9a046aa81d8992d285961b1ee/tripal_biomaterial/includes/TripalImporter/tripal_biomaterial_loader_v3.inc#L597-L603

and here:

https://github.com/tripal/tripal_analysis_expression/blob/35a51746462a81b9a046aa81d8992d285961b1ee/tripal_biomaterial/includes/TripalImporter/tripal_biomaterial_loader_v3.inc#L619-L625

The problem is that "sample_name" is hardcoded as the title of the column that should contain the biosample name. In the CSV file I obtained from NCBI, the column is titled "Sample Name" (capitalized, with a space instead of an underscore).
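To make the mismatch concrete, here is a minimal Python sketch. It is not the module's PHP code. It assumes the loader effectively compares each header title against the literal string "sample_name", as described above. The CSV excerpt, its values, and the helper name `find_column_exact` are hypothetical.

```python
import csv
import io

# Hypothetical excerpt shaped like a Run Selector export; the values are made up.
NCBI_CSV = "Run,Sample Name,Organism\nRUN0001,leaf_rep1,Example organism\n"


def find_column_exact(header, wanted="sample_name"):
    """Return the index of the column titled exactly `wanted`, or None."""
    for index, title in enumerate(header):
        if title == wanted:
            return index
    return None


# Read only the header row of the CSV.
header = next(csv.reader(io.StringIO(NCBI_CSV)))

# "Sample Name" is not equal to "sample_name", so no column is found.
print(find_column_exact(header))  # None
```

If a lookup like this returns nothing and the caller never reports it, the result matches what I see: an empty configuration section and no error.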

I will submit a pull request to fix this. However, I think a good way to handle this would be to stop hardcoding the column name and let the user specify which column holds the biosample ID. Otherwise it will always be a cat-and-mouse game of keeping up with how NCBI changes its file formats. My quick PR fix will just be an updated regular expression that finds "Sample Name".
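Both ideas can be sketched in a few lines of Python. This is an illustration only, not the code in the PR. The pattern, the function name `find_sample_column`, and the `user_column` parameter are assumptions made for the example. The sketch also raises an error when no column matches, so a bad header would not fail silently.

```python
import re

# Tolerant pattern: accepts "sample_name", "Sample Name", "SAMPLE-NAME", and so on.
# The set of accepted variants is an assumption for this sketch.
SAMPLE_NAME_PATTERN = re.compile(r"^\s*sample[\s_-]*name\s*$", re.IGNORECASE)


def find_sample_column(header, user_column=None):
    """Return the index of the biosample name column.

    If `user_column` is given, match that title (ignoring case and
    surrounding whitespace); otherwise fall back to the tolerant pattern.
    """
    if user_column is not None:
        wanted = user_column.strip().lower()
        for index, title in enumerate(header):
            if title.strip().lower() == wanted:
                return index
        raise ValueError(f"Column {user_column!r} not found in header {header!r}")

    for index, title in enumerate(header):
        if SAMPLE_NAME_PATTERN.match(title):
            return index
    raise ValueError(f"No sample name column found in header {header!r}")


print(find_sample_column(["Run", "Sample Name", "Organism"]))          # 1
print(find_sample_column(["sample_name", "organism"]))                 # 0
print(find_sample_column(["Run", "BioSample"], user_column="biosample"))  # 1
```

The regular expression only covers spelling variants someone has anticipated. The user-specified column keeps working when the export format changes again.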

spficklin commented 2 years ago

Fixed as per PR #392