monarch-initiative / dipper

Data Ingestion Pipeline for Monarch
https://dipper.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Weird NaN label as double type #433

Closed jnguyenx closed 7 years ago

jnguyenx commented 7 years ago

While working on the search Solr core, I get this weird label typed as a double, which breaks my code. It expects a String.

bgee.ttl:427506318:ENSEMBL:FBgn0036414 a OBO:SO_0000704 ;
bgee.ttl-427506357-    rdfs:label "NaN"^^xsd:double ;

kshefchek commented 7 years ago

The gene symbol is "nan" - pandas might be converting these to null NaN values. We should remove label assignment from BGEE since these should be coming from elsewhere. We also need to replace ENSEMBL:FBgn with FlyBase:FBgn, or establish equivalence between these two.
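
To illustrate the suspected cause, here is a minimal sketch of how pandas can turn a literal "nan" symbol into a missing value on read. It does not reproduce the actual BGEE ingest code: the two-row table, the column names, the second gene ID and the helper names `read_default` and `read_literal` are all made up for the example. It relies only on `pandas.read_csv`, whose default list of NA strings includes "nan".

```python
import io

import pandas as pd

# Hypothetical two-column extract. Only the first gene ID comes from the
# issue; the column names and the second row are invented for illustration.
TSV = (
    "gene_id\tgene_symbol\n"
    "FBgn0036414\tnan\n"
    "FBgn9999999\texample\n"
)


def read_default(text):
    """Read with pandas defaults: the string 'nan' is parsed as missing."""
    return pd.read_csv(io.StringIO(text), sep="\t")


def read_literal(text):
    """Read every cell as a plain string and disable default NA parsing."""
    return pd.read_csv(
        io.StringIO(text),
        sep="\t",
        dtype=str,
        keep_default_na=False,
    )


default_df = read_default(TSV)
literal_df = read_literal(TSV)

# With defaults the symbol becomes a float NaN, which a downstream writer
# could plausibly serialize as a numeric literal rather than a string.
print(type(default_df.loc[0, "gene_symbol"]))
# With NA parsing disabled the symbol stays the string it was in the file.
print(repr(literal_df.loc[0, "gene_symbol"]))
```

Passing `keep_default_na=False` is one way to keep such symbols as strings. Dropping the label assignment from BGEE, as proposed above, avoids the problem altogether.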

lwinfree commented 7 years ago

Oh that's a horrible gene name for data analysis >.<

On Mon, Mar 6, 2017 at 3:57 PM, Kent Shefchek notifications@github.com wrote:

The gene symbol is "nan" - pandas might be converting these to null NaN values. We should remove label assignment from BGEE since these should be coming from elsewhere. We also need to replace ENSEMBL:FBgn with FlyBase:FBgn, or establish equivalence between these two.

cmungall commented 7 years ago

I can't believe our beloved Python pandas is doing this. How could you, pandas? I thought you were better than Excel.

kshefchek commented 7 years ago

Fixed with https://github.com/monarch-initiative/dipper/pull/435